Editor's note: Do you remember Microsoft's Xiao Bing (Xiaoice), the chatbot that could write its own poems? Many people who read Xiao Bing's poems marveled that a machine could be so creative. Now the DA-GAN technology developed by Microsoft Research Asia has made machine painting possible. As long as people describe the picture they want in words, the computer can generate a number of images matching the description within a few milliseconds. Perhaps in the near future, DA-GAN technology will open an era where everyone is a creator.
Creativity has long been considered one of the greatest differences between human intelligence and artificial intelligence. In recent years, however, artificial intelligence has made one breakthrough after another in "creative" work. Even before Microsoft's Xiao Bing wrote poetry, people were amazed at the rapid progress in computer writing. Now the DA-GAN technology developed by Microsoft Research Asia looks set to have a significant impact on the future of artistic creation. The paper on DA-GAN has been accepted by CVPR 2018.
When people describe in words "I want a bird with a white belly, a gray head, and white on its wings," the computer can use DA-GAN to generate, within milliseconds, multiple images that closely match the text description (see below). These computer-generated birds look lifelike. They may exist in the real world, or they may be birds that the system "created" from bird characteristics and the text description.
DA-GAN-generated "birds with a white abdomen and chest, a grey head, and white on the wings" (Note: the birds in this picture do not exist in the real world)
The greatest innovation of DA-GAN - "hidden space"
The DA-GAN research team's breakthrough builds on advances in feature representation. In the past, feature representation work mostly meant letting the machine understand a picture, extract its features, and then classify it. DA-GAN works in reverse: after extracting the features of the pictures, it renders those features back into the visual space that humans perceive.
Take the birds shown above as an example. The system must first be able to summarize the structure and characteristics of birds from real-world examples, and then output the birds that users ask for. Birds were chosen as the research subject because they are very rich in features. The head alone has dozens of features, and bird experts use these subtle differences to identify species. This richness makes birds a good test of the model's generative ability.
Fu Jianlong, a researcher at Microsoft Research Asia, said, "When training the DA-GAN system, we first let it 'see' many species of birds. It is just like a person who knows red apples: on seeing a green apple, he can judge from its appearance that it is also an apple. DA-GAN learns the empirical common sense for recognizing birds from the bird pictures it has been exposed to."
Unlike traditional training, which requires paired data, DA-GAN does not need each text description to be mapped to a real bird. Instead, it splits the original image into different parts (call a sample part T), for example head, body, tail, and posture. Each part is projected into a "hidden space" (call the corresponding generated part T'). Training on a large number of pictures then checks how precisely T and T' correspond, which amounts to continuously checking whether the "hidden space" is good or bad. Iterating on this check ensures that the mapping from T to T' is not random but follows consistent rules, so the "hidden space" model gradually improves. This process is the core innovation of the DA-GAN system, and it is what makes the system smarter and truly able to generalize from one example to others.
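The idea of projecting each part into a hidden space and repeatedly checking how well T and T' correspond can be illustrated with a toy sketch. This is not the DA-GAN architecture: a tiny linear encoder/decoder trained by gradient descent stands in for the real networks, the part features are made-up four-number vectors, and every function name (fit_hidden_space, consistency_error, toy_part_samples) is a hypothetical choice for illustration only.

```python
import random

PARTS = ("head", "body", "tail", "pose")  # the four parts named in the article


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def consistency_error(enc, dec, samples):
    """Mean squared distance between each part T and its regenerated T'."""
    total = 0.0
    for t in samples:
        t_prime = matvec(dec, matvec(enc, t))  # T -> hidden space -> T'
        total += sum((a - b) ** 2 for a, b in zip(t, t_prime))
    return total / len(samples)


def fit_hidden_space(samples, latent_dim=2, epochs=200, lr=0.02, seed=0):
    """Train a linear encoder/decoder pair (a stand-in, not DA-GAN itself)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(latent_dim)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(latent_dim)] for _ in range(dim)]
    for _ in range(epochs):
        for t in samples:
            z = matvec(enc, t)                 # hidden-space code of the part
            t_prime = matvec(dec, z)           # regenerated part T'
            residual = [p - q for p, q in zip(t_prime, t)]
            # Gradient flowing back to the encoder, using the decoder
            # weights as they were before this step's update.
            back = [sum(dec[i][k] * residual[i] for i in range(dim))
                    for k in range(latent_dim)]
            for i in range(dim):
                for k in range(latent_dim):
                    dec[i][k] -= lr * residual[i] * z[k]
            for k in range(latent_dim):
                for j in range(dim):
                    enc[k][j] -= lr * back[k] * t[j]
    return enc, dec


def toy_part_samples(seed, count=20):
    """Made-up 4-D part features that really only vary in 2 directions."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
        samples.append([a, b, a, b])
    return samples


errors = {}
for index, part in enumerate(PARTS):
    samples = toy_part_samples(seed=index)
    untrained = fit_hidden_space(samples, epochs=0)  # initial weights only
    trained = fit_hidden_space(samples)
    errors[part] = (consistency_error(*untrained, samples),
                    consistency_error(*trained, samples))

for part, (before, after) in errors.items():
    print(f"{part}: T-T' error {before:.4f} -> {after:.4f}")
```

Each part gets its own encoder and decoder here, and the T-T' error is the only training signal, so no text-to-image pairs are involved. The real system presumably relies on far richer networks and losses than this linear stand-in.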
Flowchart of DA-GAN's deep attention encoding
Next, DA-GAN can use this model to create the bird the user wants. As described at the beginning of the article, you enter your requirements and a lifelike bird is generated accordingly. It may be a bird that actually exists in nature, or it may be an "imagined" bird with the head of species A, the body of species B, the tail of species C, and an arbitrary posture. No such bird exists in the real world, yet it looks like a real bird.
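The recombination described above (head from one species, body from another, tail from a third) can be sketched as mixing per-part codes. The sketch below is purely illustrative: the species names, the two-number codes, and the compose helper are all made up, and a real system would pass the combined code to a generator network to produce an image.

```python
from itertools import product

# Hypothetical hidden-space codes per species and part (made-up numbers).
species_codes = {
    "A": {"head": [0.9, 0.1], "body": [0.2, 0.7], "tail": [0.5, 0.5]},
    "B": {"head": [0.3, 0.8], "body": [0.6, 0.4], "tail": [0.1, 0.9]},
    "C": {"head": [0.7, 0.7], "body": [0.0, 1.0], "tail": [0.4, 0.2]},
}
PART_ORDER = ("head", "body", "tail")


def compose(choice):
    """Concatenate part codes, taking each part from the chosen species."""
    code = []
    for part in PART_ORDER:
        code.extend(species_codes[choice[part]][part])
    return code


# Head of species A, body of species B, tail of species C.
hybrid = compose({"head": "A", "body": "B", "tail": "C"})
print(hybrid)  # [0.9, 0.1, 0.6, 0.4, 0.4, 0.2]

# Every way of assigning a species to each part.
all_choices = list(product(species_codes, repeat=len(PART_ORDER)))
# Combinations that mix at least two species, i.e. birds not in the data.
novel = [c for c in all_choices if len(set(c)) > 1]
print(len(all_choices), len(novel))  # 27 24
```

Even with three species and three parts, most combinations correspond to birds that appear in no training picture, which is one way to picture how a part-based model can produce plausible but nonexistent birds.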
(a) Text to image generation (b) Object class transformation
Fu Jianlong said, "At the moment, we divide the birds into only 4 parts. By our calculations, this is a practical choice: the mapping is reasonably good and the cost to the system is small. Of course, we could also divide the birds into 10 or 30 parts. The model would then become more and more accurate, but the system cost might also increase exponentially."
Opening an era where everyone is a creator
In addition to birds, DA-GAN can be used for any image-related creation. For example, most of the previously popular mini programs that generated cartoon faces from real faces simply pasted textures onto the original photos. With DA-GAN, the result can be much closer to a caricature drawn live by an artist. It can be in the style of Van Gogh, Monet, comics, and so on, and users can apply any conversion they like.
For DA-GAN, the most important thing is the early data training: the more images there are, the higher the quality. Its resolution has also been raised from the 64×64 that other related technologies achieve to 256×256. Higher resolution means that each part of the picture contains more complete detail, and it is precisely this richness of detail that lets DA-GAN outperform similar technologies in how closely its results match the real world.
At the same time, the many new pictures generated by DA-GAN can be fed back into the system to give it more learning data. In other words, starting from a small amount of raw data, DA-GAN can generate more "real" training data and greatly ease the shortage of real data in some fields. Using this advantage, the research team achieved what it describes as the industry's first improvement from generated data on a bird dataset, raising the system's accuracy by two percentage points.
Data augmentation results
In the pose transformation task, the first column of each group of pictures is the source, the second column is the target, and the third column is the bird produced by DA-GAN: it keeps the appearance of the bird in the first column but takes the pose of the bird in the second column.
In the foreseeable future, DA-GAN technology may open an era where everyone is a creator. As long as you enter your requirements in a form it can read, it can "paint" them, even if they are objects and scenes that exist only in your mind. The virtual world depicted by DA-GAN may be no less impressive than those created by writers and artists.
Beyond that, extinct animals and plants could be revived on paper from written descriptions; more realistic portraits of criminal suspects could be produced in the security field; and people could try on clothing sold online in a way that fits their own figure. More application scenarios for DA-GAN technology are waiting to be imagined. At the same time, Fu Jianlong noted that as technology continues to develop, more and more techniques capable of producing realistic pictures and video will emerge. How to tell real from fake is a problem that researchers and the public will need to think about and solve together.