• Arthur BesseMA
    32 years ago

    It appears that the captioning model on that website was trained on the MSCOCO dataset, which was sourced from Google and Bing image search as well as from Flickr.