Art history’s downfall may have begun with cleavage: specifically, the cleavage revealed by Jennifer Lopez’s green Versace silk chiffon dress at the 42nd Grammy Awards ceremony on February 23, 2000. Desperate to take a peek, droves of fans typed “J.Lo Grammy Dress” into Google and came away disappointed, because image search had yet to be invented. Former Google CEO Eric Schmidt later revealed that the flood of searches for the dress inspired the company to create Google Images, which launched in July 2001. Twenty-one years later, that pioneering concept of publicly available, searchable photos has filled the Internet with over 1.12 trillion images while shaping visual comprehension for humans and robots alike. Through the breadth of data we’ve uploaded, shared, created, and forgotten, we’ve arrived at a new formula for image accumulation and, unknowingly, assembled a vast training dataset for AI researchers and their algorithms.
It’s taken years to arrive at the current capabilities of AI text-to-image generators. The labeling of images has arguably been the most significant contribution, demanding tremendous man-hours for incremental progress. In 2007, computer scientists at Stanford and Princeton set out to classify images, an effort derailed by the cursory labels everyday users had attached to their photos, like “cat.jpeg.” They realized that human input would be needed to write sophisticated image captions before classification programs could streamline search. (Therein lies the difference between “cat” and “fat, fluffy, white cat