If we were to produce a manual to navigate the social world, how would we depict the complexity of our interactions, step by step? What instructions would we write to explain social cues and body language?

translation : the act or process of changing something from one form to another [1b] : a change to a different substance, form, or appearance : conversion

In a way, our predictions of a future in which we interact with machines on a daily basis have come true, but that doesn’t mean that in the 21st century human-machine interactions happen smoothly and flawlessly. Our computers, alarm clocks, mobile devices and digital cameras are reasonably complex devices that require specifications from us to work properly. For that reason, an instruction manual is supplied with most of them. Whatever happened to the days in which owners read the manual before starting to use a device? Perhaps manuals have been replaced by common sense and empirical knowledge, and in some cases by extremely good design. Yet owners’ manuals involve a complex process in which the operation of a machine, along with its components and structure, needs to be translated into a visual form that comes across simply for all users. Some owners’ manuals are multilingual, so that the same product can be sold across different markets. In such cases, the cultural biases that translations are subject to must be taken into consideration to ensure clarity and leave no room for misinterpretation. Corporations like IKEA have taken a language-free approach in their assembly manuals and rely on diagrams and numbers to communicate their message, although numbers (and more specifically arithmetic) form a language with its own rules and syntax. In any case, the act of translation almost always involves a certain degree of loss: loss of information; a deformation.

Just as we have experience dealing with machines, we also have experience dealing with humans.
Yet humans, highly complex in their structure and operation, do not have manuals that explain how to troubleshoot in case of a conflict or unexpected situation. Instead, we come up with different ways of letting others know how we feel without the use of words. We rely, then, on social cues: body language, facial expressions, and the tone of our voice. In this context, translating social cues is a complex process that gets even more difficult when there is an underlying message that is not overtly communicated. If we were to produce a manual for the social world, how could we summarize the complexity of our social interactions on a step-by-step basis? Would we solve our problems better? Furthermore, how would we translate body language and gestures when these are highly influenced by culture?

Instructions for humanity is a playful exploration of the relationship between image and text, in a context in which both have to support one another. Photographs and instructions taken from real manuals are extracted from their original environments, decontextualized, and juxtaposed to produce an “anti-manual manual”. In a way, breaking the rules of logic and common sense sheds some light on issues that constantly arise in a world in which cultures, languages, behaviors and communication styles clash.
Over the last decade we have witnessed unprecedented innovations in the area of Artificial Intelligence (AI), with advantages and implications for many other fields. Since its boom a few years ago, there has been much speculation about its widespread use and potential implications. Beyond sci-fi-fueled narratives, it is undeniable that AI plays an increasingly relevant role in our society, as we interact with algorithms more often than we might be aware of and, in most cases, in more complex yet less visible ways. As AI gains momentum in our culture, we are compelled to think about the way these technologies affect almost every aspect of our world, including how we do business, how we think about our jobs and how we go about our daily lives.

Even though we are surrounded by algorithms, their inner workings remained inaccessible to artists and designers until very recently. That has changed partly due to initiatives such as Google’s AI Experiments and Stochastic Labs’ Artbreeder, which make it possible to explore the image-making capabilities of sophisticated machine learning methods such as Generative Adversarial Networks (GANs). Similarly, websites such as Deep Dream Generator allow anyone to remix images through algorithms that reinterpret two different images to create a new one. The open-access nature of these projects has made it possible for people outside academic spheres to experiment with AI, share their results and process online, and be inspired by each other.

In this context, I set out to explore the creative possibilities of incorporating AI algorithms into my work. My personal experience joins that of other artists and designers who have learned about the transformative effects these algorithms can have on the creative process, and the opportunity of embracing uncertainty as creative fuel.
Gene-editing is the artist’s brush

Using a technique dubbed “gene-editing”, Artbreeder’s image-making process is centered on mixing the genes of millions of different images to generate an entirely new one. However, instead of choosing specific images, the artist determines categories (e.g. structures, small objects, animals) to add genes from and generate a new image. Intentional about what influences to bring to the piece, but not knowing precisely what the outcome will look like after a new gene is added, the artist needs to constantly align each new result with their artistic vision, developing a new creative intuition in the process.

Over time I learned that to add a specific color, texture or shape, I had to think in terms of objects that would contain those genes. To obtain a bright green, for example, I would incorporate a reptile gene, generate a new image, evaluate the results and align them with my artistic vision. For a bit of structure, add a chair. Generate a new image. Evaluate, tweak genetic information, repeat. Less toad, more chair.

The relationship between objects and their visual properties is strongly wired in the mind of artists and designers: when we talk about color and suggest a tomato red instead of scarlet, we are making that connection. This process makes use of that relationship and pairs it with tremendous computational power. Thus, each possible image and its respective visual properties also become part of the genetic mix.
Adding a tomato gene would therefore not only add bright red to the image, but also pass down other genetic material (visual properties) such as a plump, round shape and a smooth, glossy texture, along with other likely genetic possibilities such as seeds, knives and kitchen counters, all with their respective visual properties and artistic possibilities. This offers a glimpse of the algorithm’s inner workings, shedding light on a computational power that undoubtedly surpasses that of the human brain. AI-assisted art is born.

This image-making method encourages the artist to consider several visual properties simultaneously to inform the next step, and graphic designers are particularly primed for this level of visual thinking. In tweaking how much influence each gene has on the final piece, the artist retains creative agency but is not fully in control. In this relationship, the artist is intentional about what influences to bring to the piece while also responding to the algorithm’s results, so the algorithm is also influencing the artist’s vision.

In Chimera, every creature is a new genetic experiment. Their appearance reflects the eclectic nature of their genetic make-up: part reptile, part building, part beehive.

AI-assisted art vs AI-created art

In this particular situation, the algorithm is not autonomous; this process requires a specific type of visual cultural intelligence that only humans can bring. The images that emerge are ambiguous and unpredictable, so uncertainty and surprise become powerful creative drivers, allowing the artist to create images in a different way. The powerful connection between uncertainty and creativity has been written about extensively.
In her essay “Uncertainty as a Creative Force in Visual Art”, Sasha Gershin (2008) highlights the connection between uncertainty and artistic creation: “Generally, certainty is identified with sound academic practice based on complete knowledge, where the outcome is predictable, while uncertainty belongs to the realm of incomplete knowledge and implies a surrender to chance. Here the outcome is less predictable and this uncertainty can be conscripted as an active collaborator within the process of art making.” (Gershin, 2008)

While Gershin talks about a collaboration between artist and uncertainty, in this particular scenario another possibility is added to the mix: that of an active collaboration between machine and human, which has been explored before with varying degrees of success. The idea of a true creative collaboration implies that both parties, human and machine, have equal degrees of creative agency, thus assuming algorithms are capable of artistic expression. Artist and researcher Holly Herndon recently explored this idea through PROTO (2019), a musical project created with an AI entity trained to learn and generate music through call and response. Similarly, the Paris-based art collective Obvious (whose AI-generated portrait Edmond de Belamy sold for $432,500) believes machine creativity is possible: “We found that portraits provided the best way to illustrate our point, which is that algorithms are able to emulate creativity.” (Caselles-Dupré in Bastable, 2018). Likewise, computer scientist and researcher Ahmed Elgammal and his team at Rutgers University have worked on developing a machine learning model that focuses on creativity, called CAN (Creative Adversarial Networks), which introduces a change in the GAN algorithm to replace similarity with novelty: “The system generates art by looking at art and learning about style; and becomes creative by (…) deviating from the learned styles.” (Elgammal et al., 2017).
Far beyond refining an already sophisticated tool, the aspiration Elgammal and his team are trying to fulfill is that of a true human-AI collaboration. The difference between AI-assisted art and AI-created art rests on that very question of agency. In this particular process, creative autonomy, although influenced by the algorithm, is retained at all times.

AI, culture and context

Unlike computer science, art and design have the ability to engage human emotions in their process, thus aiding in the exploration of social issues. Similarly, due to their nature, both disciplines offer great tools to engage in conversations and speculate about the role of AI-assisted and AI-generated images in our culture.

Womaness is a reflection of what mainstream media has established as femininity: pink as the predominant color, voluptuous silhouettes, blobs of nude hues everywhere. Unlike my chimeras, the images in this series are very similar in their genetic make-up, and specific genetic variations were made to add meaning and cultural context. The process, although similar in practice, differs from that of Chimera in that, instead of focusing on visual properties such as color and texture, I explored the semiotic nature of stereotypical symbols of femininity (lipstick = women, dress = feminine, etc.). The algorithm helped me create thousands of unique images made up of genes from images in categories such as “lipstick” and “brassiere”. The resulting images bear a vague and uncanny resemblance to what we have defined as feminine. What we see is not the algorithm’s judgement, but a direct reflection of our visual and cultural landscape. The predominance of a very limited (and very light) gamut of skin tones confirms our well-known issues of diversity and representation (or lack thereof) in mainstream media.
Embracing a new aesthetic

The images created by algorithms look wild and untamed, chaotic yet vaguely familiar: one could say, a version of our world from an algorithm’s perspective. There is undoubtedly creative expression, but is it only that of the artist? The forms, blurry and undefined, have a raw quality to them and, in the era of photoshopping, it is tempting to want to post-process the images to fix blemishes and polish imperfections. The decision to leave the images largely untouched goes beyond embracing imperfections to understanding that they are an integral part of the image. In this context, the presence of glitches and imperfections in the image reveals its algorithmic origins.

As a new image-making method, algorithms and neural networks bring about new aesthetic realms and remind us that there’s beauty and delight to be found amidst chaos. By opening up to a new process in which we relinquish some control, we are rewarded with dreamy yet uncanny resemblances of our reality. Many of these images vaguely resemble the idea of a landscape, a person or an object, but with elements too complex to be human. While their dream-like qualities could be linked with influences like the psychedelic era of the late 60s and early 70s, there’s something human yet mechanical about how these images are rendered. Some are mesmerizing, others are uncomfortable, and many others are absolutely otherworldly, reminding us of human experiences as extreme as altered states and childhood pastimes such as cloud gazing. Coming from computers, such imagery suggests that artificial brains are more human than they may seem.
“The fact that humans report that Google’s Inceptionism looks to them like what they see when they hallucinate on LSD or other drugs suggests that the machinery ‘under the hood’ in our brains is similar in some way to deep neural networks.”

These images remind us there’s beauty and delight to be found amidst chaos. It is also a recognition that creativity is ever evolving, and changes with the times. By evaluating our current understanding of creativity, we attempt to define what makes it intrinsically human, and how it is also subject to technological change.

In conclusion, AI-assisted art shines a light on how human creativity is ever evolving, changes with our cultural context and is constantly affected by our technologies. As with every novel technology we have learned to embrace in the past, we are in the very early stages of figuring out how big an impact AI can have on our world. Along with its multiple applications, multiple implications unfold simultaneously. Instead of elaborating on dystopian or utopian visions of the future, let’s explore the implications of technology and the role design plays in this context.
Algorithms, glitches and reality: a visual cultural analysis
The website thispersondoesnotexist.com (launched in February 2019 by Phil Wang) uses Artificial Intelligence (AI) to generate human-looking faces. Using a machine learning process called a GAN (Generative Adversarial Network), the algorithm is able to learn from a dataset of images and generate completely new and original faces, incorporating endless variations in features such as pose, face shape and hair style, among other elements. After its launch, the website quickly went viral, amassing critics and followers alike, as it showcases what is possible with current technology. The AI-generated images from this website are the focus of this visual and cultural analysis.

Why should we study AI-generated images? Are they worthy of scrutiny outside the field of Computer Science? I argue that the moment they become accessible to virtually anyone around the world, they are subject to study. Furthermore, we understand that images are cultural products, and as such, they are worthy of study. Obviously, not all images are the same or have the same importance; but we know from living in an image-driven society that they matter, often in more than one way. According to Wang, the website was created to “(...) call attention to A.I.’s ever-increasing power to present as real images that are completely artificial.” (Paez, 2019). In a society where pictures and images are the standard surrogates for proof, this website demonstrates that this technology (which automates work that once required painstaking labor on the part of imaging experts) could be both revolutionary and dangerous. In order to unpack the cultural implications of these images, I will use Prown’s methodology, starting with a single image retrieved from the website, and then incorporating observations from a set of 60 additional images taken from the same site.

I. Description

On an elemental level, all of these images are made out of pixels; however, what makes their existence possible is the algorithm, which generates a new image every time a user accesses the website. Each image is unique and accessible only once, because as soon as the browser is refreshed, a new one emerges. Even pressing the Back button in the browser will generate a new image. In that sense, these images are ephemeral, unless saved to a computer’s hard drive, which can be done by pressing the Save button in the lower right corner of the screen.

In spite of the aforementioned inconsistencies, the faces on this website are so realistic that it’s easy to forget they are computer-generated. Many sport accessories of all sorts, from jewelry to sunglasses and somewhat ambiguous head pieces (Fig 2). Others give indication of garments such as shawls and t-shirts, while others even mimic fabric textures such as velvet, satin, lace and floral patterns, with varying degrees of success. In most images one is able to identify highlights and shadows consistent with our real-life experience of taking pictures under different light sources (e.g. natural sunlight vs flash). This is particularly interesting since it leads the viewer to make distinctions such as indoor and outdoor pictures, in spite of knowing that no light source, camera, photographer or subject was involved in the making of these images (Fig 3). Furthermore, the background and colors in some images suggest scenes and contexts, such as a winter scene or a family gathering. It is only upon closer inspection that one starts to notice inconsistencies: glitches. And while the majority of these glitches do not necessarily make the face look less human or less real (except for a few examples), they do remind us of their intrinsic technological origin.

II. Deduction

Interacting with the algorithm

There appears to be more than one way to interact with the algorithm.
The first way is through the website, which promptly greets the user with an image. For what seems to be a long time (four seconds), the user is forced to pause and engage with the image, as the page doesn’t have a title or navigation menu. For a brief moment the interface consists only of a gray background and the image covering most of the screen. In the absence of buttons and UI elements, the user has no other choice but to stare at the image. The image, in turn, greets and confronts. The faces are almost always friendly and familiar (some are even attractive), thus captivating the viewer and, at the same time, prompting many questions.

After four seconds, a window appears in the lower right corner of the screen (Fig 4). It is a stark gray box devoid of any styling, text-heavy and cluttered. The only colors used are black for the text and blue for hyperlinks. Although this was clearly designed to provide information without being distracting, the kind of information it provides is obscure, as it doesn’t clearly explain the origins of the image. The first line of the box reads “Imagined by a GAN (generative adversarial network)”. Here, the use of the word imagine stands out. For anyone who isn’t familiar with this jargon (GAN, StyleGAN2, Nvidia, AI), all this information is meaningless and says very little about the image. The fifth line reads: “Don’t panic. Learn how it works”. This assumes discomfort on the part of the user, anticipating that this is something people usually feel confused by.

Although the gray box offers many choices, clicking on any of those links would interrupt the user experience, as all links open in new windows. There is also the option to see Another and to Save, each offering a new form of interaction. Here, to click Another is to re-engage in the experience, feeding back into the loop; whereas to Save the image is to capture it, to contain its ephemerality, and to some degree, materialize the algorithm.
But this containment destroys its very essence; it becomes just one image among many others on one’s computer. Without its original context it is stripped of meaning; the link between the image and the algorithm (its origin) is now untraceable.

Emotional response

The algorithm itself is faceless; however, the images it imagines are highly recognizable. Many of them look like familiar faces we’ve seen before, or are about to encounter on our next trip to the grocery store. As I observe the new image on my screen, I can’t help but wonder whether there is actually someone out there in the non-virtual world whose face closely resembles the one imagined by the machine. I’m intrigued. I click Another. A very different, and equally compelling image appears. I start asking more questions, trying to guess their age, their name, their story. But then I realize: these are non-humans resembling humans. There is no story, there is no name, there is no age. Nevertheless, I am caught in a loop of instant gratification; one image after another. I keep clicking, and it becomes an automatic process. A new face with a familiar expression emerges, sterile, yet unthreatening.

I am interacting with a face-generating machine, and the interaction quickly turns into a game in which I look for cues to determine the authenticity of the image presented; I look for glitches. The more glitchy the image, the less authentic. In this case authentic = human, while inauthentic = computer-generated. As I hunt for glitches, I progressively get attuned to the smaller details and observe what the texture of the skin looks like; I notice inconsistencies and elements that seem out of place, like a woman wearing just one earring, or a person who appears to be wearing a jacket only on one shoulder. I click Another and look for another glitch, but I can’t find any. As it turns out, some of these images have no glitches at all. My ideas of authenticity and reality are suddenly challenged.
As a designer I ask myself: how can I interact with these images? I can start by building a proto-taxonomy to classify them. As I begin to sort, categorize and group these images, I realize I’m approaching this task with a cultural mindset in which I look for binaries; for example, clustering female and male faces, then noticing those that look very androgynous and creating a new category for them. Thinking about age, one can categorize by children’s faces, adult faces, teens, etc. The way in which I try to make sense of these images reveals a lot about how I approach the world, and the biases and preconceptions that I carry with me. Even though it might be tempting to assume these images are empty and unexpressive because they don’t have a story, they end up telling us more about ourselves than we might have imagined.

Since these are not real human faces, does this mean they are exempt from the laws, rules and regulations other images of (real) humans are subject to? Should computer-generated images of children, for example, be treated in the same way as images of real children? What ethical and moral concerns regarding identity and representation apply to these computer-generated images? As I ponder these questions, I realize that, although it might be too soon to tell how these images alter the creative process, they are already making me think differently about how I use images in my work and the various levels at which designers engage with images.

III. Speculation

Seeking, embracing, and avoiding glitches

The glitches found in these images usually manifest as irregularities in the background and skin texture, interruptions of patterns, amorphous blobs of color and texture, and overly asymmetrical faces, among others (Fig 5). For the computer scientist who constantly strives to refine these technologies, the goal is to achieve a more convincing and realistic image, thus these glitches are undesirable.
Conversely, for the lay viewer who is confronted for the first time by hyperrealistic, computer-generated images, glitches are desirable, as they allow her to verify the authenticity of the image presented, i.e. to discern human (real) faces from non-human (fake) faces. In this context, the absence of glitches conceals the origin of the image; thus, the glitchier the image, the more transparency there is about its origin. While one definition of glitch is related to a “minor malfunction” (Merriam-Webster, 2019), another definition is related to authenticity, as in “a false or spurious electronic signal” (ibid.).

In a study conducted by Lehmuskallio and colleagues in 2019, the researchers set out to test whether imaging experts (professional photographers and photo editors) would be able to determine if the images presented in the study were real photographs or computer-generated. In trying to determine the authenticity of said images, those that were deemed too perfect by the participants often raised suspicion: “In contrast to the ways in which our research subjects spoke of photographs, computer-generated images were discussed particularly via opposition to any claims to authenticity. Whereas photographs were described with terms such as ‘authentic’, ‘natural’, ‘true’ and ‘trustworthy’, computer-generated images were considered to be artificial, unnatural, made, depicting a parallel reality and too perfect.” (Lehmuskallio et al., 2019)

Here, the absence of glitch is suspicious; its presence is desirable, almost reassuring, in a way creating an aversion to perfection: the less convincing these images appear, the more we trust them. In light of the ever-increasing sophistication of these images and their respective technologies, we must find new ways to tell truth from fiction, so we look for flaws. In this scenario, glitches elicit an amount of scrutiny that we might not employ when looking at another “normal” image.
Is hyperawareness how we will fight forgery and fallacy?

On the ever-changing role of the glitch

Ever since we first incorporated technology into our lives, we have learned to co-exist with glitches. Thus, the significance of the glitch in our culture has evolved along with our technologies, as we have historically manipulated glitches for various purposes. For decades, glitch artists have extensively explored the expressive capabilities of glitches. In some glitch art, the artist’s role is to set up situations in which errors manifest (Barker, 2011), while in others, the artist purposefully misuses technology to produce them. In his work, glitch artist Zach Nader purposefully misuses tools in Adobe Photoshop (such as the Content Aware tool, which is powered by AI) to create exaggerated, distorted and chaotic effects.

In other cases, glitches have been introduced to make a technology appear more authentic (human) and relatable. One example is Google Duplex, which is an “AI system for accomplishing real-world tasks over the phone on behalf of humans”, such as calling to make a reservation at a restaurant. A peculiarity of this system is that, during its conversations with humans, Duplex incorporates speech disfluencies (speech glitches such as “ahs”, “umms” and “mm-hmm”) to make the voice assistant more relatable and, arguably, improve interaction. Because of these human-like traits, the recipients of the calls are less likely to suspect they are talking to a robot.

While glitches have been sought and embraced, in other situations they have been avoided and feared, as exemplified by the article “Computing glitch may have doomed Mars lander”, whose title carries a negative connotation and implies a significant loss (financial, technological and otherwise) caused by a glitch.

What are the cultural implications of AI-generated images?
It is clear that, as stated by Wang, there is a need to raise awareness of what is possible with these images, including positive applications as well as the dangers of their misuse. The main concern is that this website uses the same technology as deepfakes, which are computer-generated images superimposed on other pictures, videos, or audio, often used to pull hoaxes in which a person appears to do or say things they never did or said. So far, deepfake videos have appeared in pornography and satire, but it’s almost certain they’ll be used for other purposes beyond entertainment and commentary. Because these techniques are so new and increasingly sophisticated, people are having trouble discerning truth from fiction; thus we are morally compelled to ask how these algorithms can be used for social good.

These images also have consequences for our visual cultural landscape, not only because they are based on the human physique, but also because they are based on human cultural products. In this sense, all of these faces are a direct representation of us: physically, culturally and intellectually. One can interpret them as computer-generated renditions of what our world looks like right now. Judging by the hairstyles, jewelry and accessories, we can assert that these images look decidedly contemporary; one could reasonably place them anywhere in this decade. As we modify our bodies and appearances through methods like extreme makeup and plastic surgery, images of overly contoured, overly stretched faces are more commonplace in our media landscape (particularly in celebrity culture). Will new computer-generated images incorporate this post-human aesthetic? And can these computer-generated images be regarded as cultural evidence of our times? In our visual experience, faces are linked to identities, and identities are infinitely complex.
In this context one would be tempted to question the diversity of these images and their potential consequences for issues of identity and representation. Obviously, the answer to the diversity issue has to do with the type of images the algorithm was trained on, and whether the developers and researchers made any considerations in this respect. But do these images have to be diverse? These images reflect the technology-driven culture in which they were born, the culture of the humans involved in the algorithm’s creation, and that of the people whose faces the algorithm was trained on. Thus, the range of people represented in them not only reflects what is relevant in our society, but who is relevant in our society. What real function these images will have in the near future is yet to be determined, but we should evaluate them for their current function, which is that of reflecting our ideals and values; thus, whose faces are depicted in them matters.

References

Gibney, E. (2016). Computing glitch may have doomed Mars lander. Nature, 538(7626), 435–436.
glitch. (2019). In Merriam-Webster.com. Retrieved Dec 15, 2019, from
Barker, T. (2007, October). Error, the Unforeseen, and the Emergent: The Error and Interactive Media Art. M/C Journal, 10(5). Retrieved from
Krapp, P. Noise channels: Glitch and error in digital culture (Electronic mediations; v. 37). Minneapolis: University of Minnesota Press.
Leviathan, Y. et al. (2018, May 8). Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone. Google AI Blog. Retrieved from
Lehmuskallio, A. et al. (2019). Photorealistic computer-generated images are difficult to distinguish from digital photographs: a case study with professional photographers and photo-editors. Visual Communication, 18(4), 427–451.
Nader, Z. (n.d.). [Website]
Paez, D. (2019, Feb 21). This Person Does Not Exist Creator Reveals His Site’s Creepy Origin Story. Inverse. Retrieved from
Wang, P. (2019). This Person Does Not Exist [Website]. Retrieved from