Unlock the potential of Meta’s Llama3 for biotech applications with torchtune, a PyTorch-native library that simplifies fine-tuning of large language models. Whether you're looking to enhance drug discovery, genetic analysis, or scientific writing, our latest blog post provides a guide to using this tool effectively. 🔗 Learn more here: https://lnkd.in/g9WnrNSN #Biotechnology #Llama3 #torchtune #AI #MachineLearning
ZONTAL’s Post
More Relevant Posts
-
Because the answer was not explicit and Matt Goddeeris was curious, here is a follow-up on how much data we need. Follow me for more explorations of Computational Biology and ML for drug discovery and beyond (we'll find out together what's beyond...) Happy computing! #datascience #machinelearning #ai #computationalbiology #drugdiscovery https://lnkd.in/ePuH3UUz
How much data, again?
http://computingbiology.blog
-
📝 Did you know that large language models (LLMs) are not restricted to generating conversational text but can also be used to analyse nucleic acid and amino acid sequences, and even single-cell multi-omics data? 👷 By deconstructing LLMs pre-trained on biological data and leveraging their internal workings, numerical representations of sequences can be obtained; these representations can then be used to decipher phylogeny, and even biological function. 🧬 Intuitively, these LLMs can be trained and used to understand the fundamental structure of protein and nucleic acid sequences by treating them as a "language", and can be deployed to annotate, compare, and analyse genomes. If you are interested and would like to know more, please check out this recent review entitled "Large Language Models in Plant Biology", which I wrote with A/Prof. Marek Mutwil and Ong Xing Er. The review covers: 💻 1) Different types of language models and their usage in biology 📊 2) Transformer-based models and their anatomy 📈 3) How the models can be interpreted to gain new insights ☘ 4) How plant scientists could deploy LLMs https://lnkd.in/gX_UrA8q
Large Language Models in Plant Biology
arxiv.org
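The idea in the post above, treating a sequence as a "language" and turning it into a numerical representation that can be compared, can be illustrated with a small sketch. This is not the method from the review: a real language model would produce learned embeddings, whereas here a plain k-mer count vector is swapped in as a toy stand-in. The function names `kmer_tokens`, `embed`, and `cosine_similarity` and the choice of `k = 3` are assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def kmer_tokens(sequence: str, k: int = 3) -> list:
    """Split a sequence into overlapping k-mers, the 'words' of the sequence."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def embed(sequence: str, k: int = 3) -> Counter:
    """Toy stand-in for a learned embedding: a sparse k-mer count vector."""
    return Counter(kmer_tokens(sequence, k))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as sharing nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical sequences, made up for this example.
    seq_a = "ATGCGTACGTTAG"
    seq_b = "ATGCGTACGATAG"
    print(kmer_tokens(seq_a)[:4])
    print(cosine_similarity(embed(seq_a), embed(seq_b)))
```

With a pre-trained model, `embed` would be replaced by the model's own sequence representation; the comparison step stays the same in spirit.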
-
🔬 Navigating biological metadata with AI! In the vast landscape of biological data, repositories like the Gene Expression Omnibus (GEO), which have minimal data entry standards, pose a significant challenge for researchers. Delving into the metadata to grasp the essence and rationale of experiments often becomes an arduous task. The unstructured nature of this data demands substantial manual effort, as scientists strive to identify potentially valuable information. 🎯 To address this, I harnessed large language models. By integrating GPT-4 with LangChain, I've built an approach that streamlines metadata processing and surfaces insights quickly. To demonstrate its capabilities, I've spotlighted metadata related to NGS experiments in the context of Non-Alcoholic SteatoHepatitis (NASH). 🔗 Curious about the transformation? Dive into the deployed app: https://lnkd.in/esWzvwDa 🚀 Future developments: If this is of interest to some of you, I may consider incorporating the following features: automating the differential expression workflow; introducing support for proteomics identification; UI improvements; refining the approach with Llama 2 for optimal cost and efficiency. Feedback and potential collaborations are all welcome. #TargetDiscovery #Bioinformatics #GPT4 #NGS #LargeLanguageModels
-
Our very important work on the discovery of plant transcription factor binding sites is out now. It is a revolutionary algorithm that will enable plant scientists to accurately identify DNA binding sites for any given transcription factor in any plant genome, whether old or totally unseen before. This work also brings to light the fact that the immense genomic variability in plants was grossly neglected, and that a wrong practice of TFBS finding became an accepted norm, generating largely wrong interpretations and misleading findings over many years. The software developed here, PTFSpot, will also empower plant biologists to bypass costly DAP-seq-like experiments and identify TFs and their preferred binding regions across any genome without any restrictions, without any need for prior annotations or knowledge, and with utmost accuracy. Thanks to the Department of Biotechnology 🙏🏽, whose funding and support help us make this world witness such work. Here is the publication:
PTFSpot: deep co-learning on transcription factors and their binding regions attains impeccable universality in plants
academic.oup.com
-
Biomimicry for Coding: "Romera-Paredes and colleagues’ work is the latest step in a long line of research that attempts to create programs automatically by taking inspiration from biological evolution, a field called genetic programming"
Large language models help computer programs to evolve
nature.com
-
We get by with a little help from our robot friends. Given that post-translational modifications, such as glycosylation, add a new layer of complexity to analyzing protein-protein and protein-drug interactions, the application of AI to glycobiology is necessary to understand, and may help predict, the role of glycans in various forms of cellular behavior. Dive deeper into the world of bioinformatics with our newest article: https://bit.ly/47T5q6N #VectorLaboratories #Bioinformatics #Glycoinformatics
SugarGPT: Envisioning the Future of Glycoinformatics
https://vectorlabs.com
-
Machine learning, particularly through genomic language models (gLMs), has been effectively used to learn latent functional and regulatory relationships between genes by training on millions of metagenomic scaffolds. gLM not only captures the genomic context and protein sequence but also encodes biologically meaningful information, such as enzymatic functions and taxonomy, through contextualized protein embeddings. The analysis of gLM’s attention patterns reveals that it successfully learns co-regulated functional modules, like operons, demonstrating its potential to uncover complex relationships between genes in genomic regions.
New Harvard-Developed AI System Unlocks Biology’s Source Code
https://scitechdaily.com
-
"achieved remarkable success in various domains such as language and computer vision. Specifically, the combination of large-scale diverse datasets and pretrained transformers has emerged as a promising approach for developing foundation models. Drawing parallels between language and cellular biology (in which texts comprise words; similarly, cells are defined by genes), our study probes the applicability of foundation models to advance cellular biology and genetic research. Using burgeoning single-cell sequencing data, we have constructed a foundation model for single-cell biology, scGPT, based on a generative pretrained transformer across a repository of over 33 million cells. Our findings illustrate that scGPT effectively distills critical biological insights concerning genes and cells. Through further adaptation of transfer learning, scGPT can be optimized to achieve superior performance across diverse downstream applications. This includes tasks such as cell type annotation, multi-batch integration, multi-omic integration, perturbation response prediction and gene network inference" https://lnkd.in/dzVmmM7P
scGPT: toward building a foundation model for single-cell multi-omics using generative AI - Nature Methods
nature.com
-
We all understand that a protein's sequence determines its 3D structure, which in turn dictates its function. The interplay of sequence, structure, and function is the key to understanding every mechanism proteins are involved in, including diseases. A multi-modal protein language model that "understands" this relationship is poised to revolutionize bioinformatics and potentially become the 'intelligent Swiss Army knife' for protein design. It’s amazing how biology is now being demystified by matrix multiplications inside a GPU ;) https://lnkd.in/eixAj9wC https://lnkd.in/e925u_EF
Ex-Meta scientists debut gigantic AI protein design model
nature.com
-
In the fast-growing landscape of synthetic biology, the conventional trial-and-error methods often act as roadblocks, slowing our momentum in organism engineering and biomanufacturing. While these traditional approaches have served us, there's an undeniable need for more predictive and efficient strategies. Inspired by the precision and computational prowess observed in sectors like architecture and aerospace, we're poised to take synthetic biology to the next level. We're pleased to announce that our latest preprint* is now available on bioRxiv. The paper outlines the innovative mechanics behind our DNA Design Platform, a system that leverages the power of artificial intelligence to enable context-sensitive decision-making. This capability allows us to introduce a more nuanced, scalable approach to gene expression optimisation across a variety of host organisms. Our platform signifies a crucial step forward in the field. It employs advanced deep learning algorithms, setting the stage for a transformative change in the way we approach the 'design, build, test, learn' cycle. This leads to a more efficient, predictive approach that constitutes a significant leap forward in the existing paradigms of synthetic biology. -- * From Context to Code: Rational De Novo DNA Design and Predicting Cross-Species DNA Functionality Using Deep Learning Transformer Models https://lnkd.in/d3qwj4CJ
From Context to Code: Rational De Novo DNA Design and Predicting Cross-Species DNA Functionality Using Deep Learning Transformer Models
biorxiv.org