Another piece of relevant news: Hugging Face has reproduced DeepMind's Flamingo (https://hottg.com/gonzo_ML/941), a model that combines a pretrained vision encoder with a pretrained language model.
They plan to open-source this work soon.
More details: https://www.linkedin.com/posts/victor-sanh_multimodal-llm-deeplearning-activity-7038583909994885120-BjsF