Uncovering GenAI trends: Using local language models to explore 35 organizations

Stefan French

from The Mozilla Blog on 2024-06-10 13:00 (#6NDQ8)

(To read the complete analysis as well as the results of each language model, visit the mozilla.ai blog here.)

Over the past few months, Mozilla.ai has engaged with several organizations to learn how they are using language models in practice. We spoke with 35 organizations across various sectors, including finance, government, startups, and large enterprises. Our interviewees ranged from machine learning engineers to CTOs, capturing a diverse range of perspectives.

To analyze these interviews, we used open-source local language models running on our laptops. The analysis confirmed the trends we had anticipated during our interviews and shed light on the differences in each model's presentation of said trends.

Objective: Help shape our product vision

Our primary aim was to identify patterns and trends that could inform our product development strategy. Despite the unique nature of each discussion, we usually focused on four critical areas:

LLM use cases being explored
Technology, methodologies, and approaches employed
Challenges in developing and delivering LLM solutions
Desired capabilities and features

Data collection & model selection

After each conversation, we wrote up summary notes. In total, these notes for the 35 conversations amounted to 18,481 words (approximately 24,600 tokens), almost the length of a novella. To avoid confirmation bias and subjective interpretation, we decided to leverage language models for a more objective analysis of the data. By providing the models with the complete set of notes, we aimed to uncover patterns and trends without our pre-existing notions and biases.

Given privacy concerns, we decided to keep the information local. Therefore, I selected a set of models that I could run on my MacBook Pro M3 (36GB) locally. Here's an overview of the models and configurations used:

Model	Parameters	Quantization	Size
Llama-3-8B-Instruct-Gradient-1048k	8B	Q5_0	5.6GB
Phi-3-medium-128k-instruct	14B	IQ3_M	6.47GB
Qwen1.5-7B-Chat	7B	1_5	5.53GB

There are a number of options to run LLMs locally, such as ollama, lm-studio, and llamafile. I used both lm-studio and llamafile (an in-house solution by the Mozilla Innovation Team).

Summarizing with local language models

The prompt used to generate model outputs was: Summarize the following information to get the key takeaways about developing LLM solutions in 10 bullet points. Take the full information from start to finish into account. Never use company names or an individual's name. [Full notes]"

To read the complete analysis as well as the results of each language model, visit the mozilla.ai blog here.

Key takeaways

I was impressed by the quality of the responses from these models, which were all capable of running locally on my laptop. They identified the majority of trends and patterns among the 35 organizations we studied. Each model also highlighted unique insights and communicated in different styles:

Llama-3-8B-Instruct-Gradient-1048k emphasized the main LLM use-cases that were discussed and the difficulties moving from prototype to production. The style of the sentences generated can be quite long.
Phi-3-medium-128k-instruct picked up on the reluctance of many organizations to finetune models. Its style feels more conversational than the other models.
Qwen1.5-7B-Chat highlighted the lack of technical expertise many organizations suffer from. Its style is more concise and straightforward, similar to the style of chatGPT.

Across all the models, three key takeaways stood out:

Evaluation: Many organizations highlight the challenges of evaluating LLMs, finding it time-consuming.
Privacy: Data privacy and security are major concerns influencing tool and platform choices.
Reusability and customization: Organizations value reusability and seek customizable models for specific tasks.

This exercise showcased how well local language models can extract valuable insights from large text datasets. The discussion and feedback from our network and end-users will continue to guide our efforts at Mozilla.ai, helping us develop tools that support diverse use cases and make LLM solutions more accessible and effective for organizations of all sizes.

The post Uncovering GenAI trends: Using local language models to explore 35 organizations appeared first on The Mozilla Blog.

Source	RSS or Atom Feed
Feed Location	http://blog.mozilla.com/feed/
Feed Title	The Mozilla Blog
Feed Link	https://blog.mozilla.org/en/