AI research

Google DeepMind to start work on autonomous AI agents

Summary Google DeepMind will soon begin researching autonomous language agents such as Auto-GPT, potentially boosting the viable applications of LLMs such as Gemini. Google DeepMind is looking for researchers and engineers to help build increasingly autonomous language agents, Edward Grefenstette, director of research at Google DeepMind, announced on X. Such AI agents already exist in …

Robots can now outsmart you in a game of tag

Summary A new method teaches robots “vision-based pursuit”. In short, robots can now chase humans. Researchers at UC Berkeley have developed a new way to teach robots strategic decision-making for dynamic tasks like playing tag. Rather than simply following a person or another robot, the robot cuts them off and actively searches for them. Learning …

BioCoder is a benchmark for AI-generated bioinformatics code

Summary BioCoder is a benchmark designed to support the development of AI models for bioinformatics. Researchers at Yale University and Google DeepMind have introduced BioCoder, a benchmark for testing the ability of AI models to generate bioinformatics-specific code. As the capabilities of ChatGPT or specialized code models grow, the models will be used for increasingly complex …

CityDreamer creates unlimited 3D cities

CityDreamer, a generative AI model, creates unbounded 3D cities by separating the generation of building instances from other background objects. This separation lets the model better handle the diverse appearance of buildings in urban environments, one of the main challenges that sets city generation apart from the natural-scene generation done by methods such as GANCraft. To enhance the …

ChatGPT does years of student research in a fraction of an hour

Summary A team of researchers at UC Berkeley has successfully used ChatGPT to generate large datasets to study metal-organic frameworks (MOFs) useful in combating climate change. According to a recent study published in the Journal of the American Chemical Society, the use of ChatGPT enabled the rapid collection of data on MOFs, accelerating research. MOFs …

MVDream creates impressive 3D renderings from text

Summary MVDream uses Stable Diffusion and NeRFs to generate some of the best 3D renderings yet from text prompts. Researchers at ByteDance present MVDream (Multi-view Diffusion for 3D Generation), a diffusion model capable of generating high-quality 3D renderings from text prompts. Similar models already exist, but MVDream achieves comparatively high quality and avoids two core …

New computer vision method teaches AI to say ‘no’

Summary CLIPN teaches CLIP the “semantics of negations”. This should help computer vision to recognize classes that were not part of the training data. Computer vision models recognize the kinds of objects that appeared in their training images. In real-world applications, however, these models often encounter unknown objects outside their training data, leading to poor results. …

Meta’s foundational model for computer vision is now open source

Summary Meta releases DINOv2 as open source under the Apache 2.0 license. Meta also introduces FACET (FAirness in Computer Vision EvaluaTion), a benchmark for bias in computer vision models. Update, August 31, 2023: Meta releases its computer vision model DINOv2 under the Apache 2.0 license to give developers and researchers more flexibility for downstream tasks. …

Meta’s latest AI model makes scientific PDFs machine-readable

Summary Meta’s Nougat is an AI text recognition model that can reliably convert scientific PDFs to text. Researchers at Meta have unveiled Nougat (Neural Optical Understanding for Academic Documents), an AI model that converts PDF images of scientific articles into structured, machine-readable text. Nougat aims to bridge the gap between human-readable PDF documents and machine-readable …

AI gets much better at reading text in images

Summary BLIVA is a vision language model that excels at reading text in images, making it useful in real-world scenarios and applications in many industries. Researchers at UC San Diego have developed BLIVA, a vision language model designed to better handle images that contain text. Vision language models (VLMs) extend large language models (LLMs) by …
