April 23, 2024

Welcome to the fourth edition of the bi-weekly report for Quarry-AI. This report summarizes metrics and key activities.

The goal of Quarry-AI is to accelerate the development of artificial intelligence and software projects originating from IU research. The team will initially focus on rapidly advancing IU research by building minimum viable products and creating common infrastructure. Quarry-AI is an initiative of IU Ventures and is funded in partnership with IU alumni. The individuals working on the development of Quarry-AI are currently contractors of IU Ventures, and all IP resulting from Quarry-AI is owned by IU. The project is currently in a pilot phase through May 16, 2024.

Active Projects

Research on Knowledge Base Chatbot (Kelley School of Business)

This is a new project with Antino Kim, building on his existing work with UITS and supporting his research on how users perceive interactions with search and chatbots for a knowledge base.
Custom chatbot built using our RAG pipeline, allowing users to ask chat-style questions of the IU Knowledge Base

Bioloop (IUI School of Medicine)

First-pass implementation of generating SQL-style queries from human-language text. Incorporates an application-specific database schema.
Evaluation of differences between the CodeLlama 7b/13b/34b and aiXcoder LLMs

SpeechCraft.AI (IUB Speech Language and Hearing Sciences)

Supported the SpeechCraft.AI team at the April 13 demo at the Indiana Speech-Language-Hearing Association (ISHA) annual meeting.
Evaluating story quality from Llama 7b/13b and GPT-based LLMs, and image quality from Stable Diffusion models, as a function of prompt engineering

Book Of Data (IUB Data Science in Practice and AnalytixIN)

RAG updates that significantly improved response quality
Awaiting feedback from the student team

AI on the IU Research Desktop (Common AI Infrastructure on IU Supercomputers)

Python-based UI allowing users to use the Quarry RAG pipeline to query their own data using natural language, all privately on IU-based systems
Working on improving result quality and offering a more chat-interactive interface

Project Discussions

The next bi-weekly AI Meet-Up is this Thursday, April 25, at 4 PM at the Mill.
Interested in the nuts and bolts of AI? We've been working on:

Prompt engineering for "Instruct" and base models
RAG for running queries against large document datasets (e.g., 6000+ pages of cognitive science research papers)
Fine-tuning LLMs for domain-specific applications
Building and training transformers, autoencoders, and other DNNs from scratch
The technical differences between the major publicly-available LLMs
Deploying LLMs on IU's supercomputers

If you're interested in details -- or digging into anything else AI-related! -- please let us know and we'll aim to include it as a topic on one of our Thursday meetups.

Common Infrastructure

Deployment of just-released Llama3 and aiXcoder LLMs on IU HPC
Integration of GPT-3.5, GPT-4, GPT-4-turbo, and GPT-4-32k APIs

Primarily for quality-comparison against IU-deployed LLMs

LLM fine-tuning updates

Toolkit for creating fine-tuned models based on custom training data
Assessment of result quality and training parameters

Prompt engineering and iteration

General tools support base, Instruct, Chat, and Code-based LLMs as well as Stable Diffusion image generation models

Retrieval-Augmented Generation (RAG) pipeline

Rewrote RAG pipeline from scratch, greatly improving response quality
Scripts that process the IU Knowledge Base and ingest 2000+ pages into a RAG pipeline, providing metadata-aware and high-quality information for LLM-assisted querying
Integrated with Research Desktop (RED) to allow access for all IU users in the future

AI Application infrastructure

Five web applications now deployed, providing end-user UIs for testing Quarry-based AI applications

Evaluation of frameworks for deploying LLMs on IU supercomputers

DNN engines: llama.cpp, litgpt, vLLM, PyTorch, JAX, Trax, TensorFlow/Keras
Models: Llama3 8b/8b-Instruct; aiXcoder 7b; Llama2 7b/13b/70b; Llama2Chat 7b/13b/70b; CodeLlama 7b/13b/34b/70b; Mistral 7b; Mixtral 8x7b; Grok-1; online OpenAI/ChatGPT
Other: Lightning AI/litgpt (inference/fine-tuning), LlamaIndex (RAG), SentencePiece (tokenization), FastEmbed (tokenization/embedding), ChromaDB (vector store)

If you have any questions about the items above, please don't hesitate to reach out via email or on Slack.

Robert Henschel
