Last week Dr. Elizabeth Childs and I had the privilege of giving our first presentation about the Open GenAI project. A big thank you to the Canadian Association of Research Libraries for the invitation to come and share the project. If you would like to watch the recording from the session, you can do so here. If you would like to review the slide deck from our presentation, you can find it here. We have a few more presentations coming up over the next month, so we will be sure to share recordings here once they are available.
This past week I also reached out to the authors of the 25 open textbooks we will be using for Opterna, our AI study companion. This week I met with some of those authors, and it was really exciting to hear how interested they are in the idea of an AI study companion built around their open textbooks. The longer the project goes on, the more I am learning about both the technical considerations of working with AI and some of the tensions we often face with openly licensed resources.
As I have mentioned before, we are only committed to the development and testing of a prototype; whether we host Opterna beyond the project will be informed by what we hear at the focus groups we will run in 2026.
A couple of considerations I am already contemplating:
1. Cost and stats: open textbooks get a lot of traffic, but not all of that traffic is actual learning, and bots skew our stats. It won’t be until we are in the testing phase that we have a true sense of the cost of hosting the tool.
2. The other tension we will have to navigate is that we are using an existing, pre-trained LLM, so the tool is not being “trained” and will not draw on content beyond the 25 textbooks we have chosen. The upside of that approach is less inaccurate information in the responses Opterna generates. A challenge we already know we will face is that open textbooks eventually become outdated, so we will need to decide when it’s time to remove a textbook and replace it with a more current version, understand what that work entails, and determine whether we have the capacity to manage it.
Anyway, that is it for now. We are having lots of fun learning and engaging with folks, and we are looking forward to many more conversations in the near future.
We are now a few months into the Open GenAI project so I thought I would provide a few highlights of the work that has happened so far.
We are excited to welcome Dr. Elizabeth Childs onto the project; she will help us fulfill the deliverables and bring a pedagogical perspective to this work. Elizabeth comes with a long history of advocating for and working with open educational practice, and as the Program Head for the Master of Arts in Learning and Technology at Royal Roads University she has a deep understanding of learning and teaching with technology.
We are excited to be working with the UBC Cloud Innovation Centre (UBC CIC) on the development of the AI-powered study companion prototype. One thing I loved about the idea of working with UBC CIC was that co-op students are involved in the development. Development of the prototype is just starting, and we plan to have it completed by the end of December. The prototype will be openly accessible on GitHub once development is complete.
As part of our continued efforts to involve students in the work, we recently worked with the BC Federation of Students (BCFS) to host a student focus group to learn about students’ experiences, perceptions, and challenges in using AI in their learning journey. Once we have the prototype developed, we are hoping to work with the BCFS to get more students engaged in the testing. If you have students who you think would like to be involved in the testing, please reach out to rdevine@bccampus.ca.
Elizabeth and I are planning to do some engagement sessions with educators to learn more about their experiences, perceptions, and challenges in using AI in their learning and teaching practices. We will be inviting educators into the testing phase to get their feedback on the tool and to start building awareness among educators who are using open textbooks in their courses.
Hearing from students and educators will help shape our approach to the design of the AI powered study companion we are building with UBC CIC. This will also help inform what other resources we can provide when working with open GenAI tools as we look at years two and three for this project.
Research is a big part of this work. Harper continues to dig deeper into the use of offline generative AI tools and smaller models as a way to address environmental, privacy, and ethical concerns from the post-secondary sector, and Elizabeth is exploring research that would share our experience and learning with the larger open community. Join us at our session for the Digital Learning Strategy Forum and watch the blog for updates on how you can be part of this action research project.
By now, most are familiar with online generative AI tools like ChatGPT, Gemini, and Claude. They offer users the ability to brainstorm ideas, revise papers, generate code, and more with just a few keystrokes. But did you know that you can have those same capabilities by running generative AI locally on your computer, even without internet? In this blog post, we’ll highlight common reasons running generative AI locally might be the right choice for you, then walk through a step-by-step guide to installing and using GPT4ALL.
What does it mean to “run generative AI locally”?
When I say “running generative AI locally”, I am referring to the practice of using generative AI models that are downloaded directly onto personal devices such as smartphones or laptops, rather than relying on distant cloud servers, as is the case for online tools like ChatGPT.
Why run generative AI locally?
Data and privacy: Centralized services like ChatGPT store your data on their servers, which can include everything from your chat history to device information, and they are allowed to use this data to train future models unless you opt out. Further, you do not have control over who sees your data, how it is stored, or how it is managed beyond the options they provide you. This poses major privacy concerns, especially in post-secondary or proprietary contexts. When running generative AI locally, all your data stays on your computer, which minimizes the risk of it being used, stolen, or sold without your consent.
Environmental concerns: Even though these services are online, they still need physical hardware to run, and in the case of generative AI that hardware usually lives in data centres. Data centres require resources, such as the raw materials to build the hardware and the water to cool these large systems, and they contribute significantly to global energy consumption (which still largely relies on the burning of fossil fuels). As a result, many are concerned about the environmental impacts of AI tools as more people use them as casually as Google. By running your AI tools locally, you lower the environmental impact of using AI: you are not contributing to the use of data centres, and your own device limits your energy consumption.
Offline access: Are you in a remote area with spotty internet or dealing with power outages? Then no problem! By using local AI tools, you can use generative AI without the need for internet, which ensures uninterrupted access.
Consistency of output: Cloud-based models are frequently updated, which can disrupt workflows and research that relies on reproducibility. Local setups provide stability by allowing you to use the same model version every time and choose when you download the updated model.
In my exploration of this topic, I have used four different applications to run generative AI locally on my computer: Ollama, Jan, GPT4ALL, and LM Studio. But for this blog post, I have chosen to feature GPT4ALL from Nomic AI for the following reasons:
It is open-source software,
It emphasizes privacy,
It can interact with your local documents,
It is quick to install (10-15 minutes),
It is easy to use and virtually “plug-and-play”,
It is easy to customize the System Message of the model, which tells the model how to behave and interpret the conversation.
Get Started with GPT4ALL
The following is a step-by-step guide to downloading, installing, and using GPT4ALL. A disclaimer: I am a Mac user, so this guide shows the process on macOS.
If you’d like to skip over the installation steps, go to the section Use GPT4ALL.
Download and install GPT4ALL
1. Download GPT4ALL
First, go to https://www.nomic.ai/gpt4all to download GPT4ALL. You do this by selecting the correct operating system for your device (macOS, Windows, Windows ARM, or Ubuntu) from the dropdown menu, and clicking the “Download” button.
2. Open installer
Once downloaded, go to your “Downloads” folder and open the DMG file. Then click to open the GPT4ALL installer.
3. Navigate GPT4ALL installer
Once opened, the GPT4ALL Installer Setup window will then pop up. You will have to navigate through several standard windows such as choosing an installation folder (Applications is default), selecting components to install, and accepting the license by clicking “Next”.
Once you’ve accepted the license and clicked “Next”, the installation will begin. Once everything has finished downloading, click “Install”.
Use GPT4ALL
1. Open GPT4ALL
Once installed, you can navigate to where you have stored the application and open it. I kept the default and stored the application in my Applications folder.
Once you open the app, it will give you the welcome pop-up, detailing the latest release notes and allowing you to opt in to sharing anonymous usage analytics or sharing your chats.
After making your choices, you are taken to the homepage.
2. Download your first model
Before we start chatting, we first need to download a model to chat with. You can do this by clicking the “Find Models” button on the homepage. This will take you to the “Explore Models” page.
The simplest way is to choose a model from the list of those specifically configured for use in GPT4ALL. However, you can also use models from Remote Providers or from HuggingFace, though both these options are potentially more complicated and may require additional configuration.
When picking a model from the GPT4ALL repository, you can see the name of the model, some information about it, and some specifications like the file size, RAM requirements, and number of parameters.
Once you’ve selected a model, click the “Download” button of that model and it will start downloading.
You can view your Installed Models by clicking the “Models” button on the left sidebar.
3. Start chatting
Now that you’ve installed a model, you can start chatting with it!
First, click on the “Chats” button on the left sidebar and it will open to a new chat. Second, choose a model to chat with, either by clicking “Choose a model” at the top of the page and selecting a model or by clicking “Load (default)”. Loading a model will take a few seconds.
Now that your model is loaded, you can begin chatting with it.
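If you prefer working from a script, or want to build the same capability into your own tools, GPT4ALL also ships a Python package (gpt4all) that runs the same local models. The sketch below is a minimal example rather than part of the desktop walkthrough: the model file name is just an example (any model from the GPT4ALL list should work), and the system_prompt plays the same role as the System Message setting mentioned earlier.

```python
# Minimal sketch: chatting with a local model via the gpt4all Python package.
# Assumes `pip install gpt4all`; the model file name below is an example and is
# downloaded automatically the first time if it isn't already on your machine.
from gpt4all import GPT4All

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")  # example model file

# A chat session keeps the conversation history in memory; system_prompt plays
# the same role as the System Message setting in the desktop app.
with model.chat_session(system_prompt="You are a concise study assistant."):
    print(model.generate("Explain spaced repetition in two sentences.", max_tokens=200))
    print(model.generate("Now give one example of using it for chemistry.", max_tokens=200))
```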
3a. Using LocalDocs
A great feature of GPT4ALL is its ability to interact with documents that you upload. As mentioned before, one advantage of running AI locally is that there are fewer risks to your privacy, and this extends to the documents you upload.
To add documents to LocalDocs, either click the “LocalDocs” button in the left sidebar or click the “LocalDocs” button in the top right corner of your chat and then click “+ Add Docs”.
You will be taken to the Add Document Collection page. You can name the collection and then upload a folder with the documents you’d like to use by clicking “Browse” and selecting your desired folder.
After you’ve selected your folder, click “Create Collection” and your files will start embedding.
Once embedded, you can go back to your Chats, click “LocalDocs” in the top right corner, and then select the Document Collection you’d like to use in this chat. We only have one Document Collection but you can use multiple in one chat.
Then, you can ask questions about the content in the documents, ask for summaries, and much more. By default, the model will cite the sources it retrieved information from.
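For the curious, here is a rough idea of what LocalDocs is doing behind the scenes: your documents are split into chunks, each chunk is embedded locally, and the chunks closest to your question are handed to the model as context. The sketch below is not the LocalDocs implementation itself, just a minimal illustration of that embed-and-retrieve step using the Embed4All class from the same gpt4all Python package; the chunks and question are made up.

```python
# Minimal sketch of the embed-and-retrieve idea behind LocalDocs, using the
# gpt4all package's local embedding model. Chunks are hard-coded here; in
# practice they would come from splitting up your uploaded documents.
from gpt4all import Embed4All
import numpy as np

embedder = Embed4All()  # downloads a small local embedding model on first use

chunks = [
    "Photosynthesis converts light energy into chemical energy stored in glucose.",
    "Cellular respiration breaks down glucose to release energy as ATP.",
]
chunk_vectors = [np.array(embedder.embed(c)) for c in chunks]

def top_chunk(question: str) -> str:
    """Return the chunk whose embedding is most similar to the question's."""
    q = np.array(embedder.embed(question))
    scores = [
        float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v))) for v in chunk_vectors
    ]
    return chunks[int(np.argmax(scores))]

print(top_chunk("How do plants capture energy from sunlight?"))
```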
As we continue exploring this generative AI project, ethical considerations are at the forefront of our minds. As Robynne posited in her first post, how do we define ethical generative AI? What factors do we need to be aware of to ensure that this tool aligns with our values? With this intention, I delved into the complicated world of ethics surrounding AI. From my research, I identified 10 common ethical considerations related to the boom in generative AI, and one of those considerations is environmental impact.
It is difficult to quantify the environmental impact of generative AI: it extends from the mining of raw materials used to make the hardware in data centres to the water used to cool those data centres. But the most obvious resource used is energy. Like any other AI, generative AI uses a large amount of energy in its training stage. According to a 2023 study by Alex de Vries, GPT-3’s training process alone consumed an estimated 1,287 MWh of electricity, which is nearly equivalent to the annual energy consumption of 120 average American households in 2022. But energy consumption doesn’t end after the training phase. Each time someone prompts one of these LLMs such as ChatGPT, the hardware that processes and performs these operations consumes energy, estimated to be at least five times more than a normal web search. With the popularity of LLMs such as ChatGPT, as well as generative AI being added to seemingly every application and technology, the number of users is only growing.
With that in mind, what do we do to mitigate the environmental impact of our AI Study Tool? The general answer is to ensure that our tool is energy efficient, and we are currently exploring three ways to do this.
Using smaller language models: Given that large language models (LLMs) such as ChatGPT consume lots of energy both in training and after release, an obvious way to reduce energy consumption is to use smaller language models. A small language model (SLM) is distinguished from a large language model by its number of parameters: any language model with fewer than roughly 30 billion parameters is considered an SLM. Because an SLM is trained on a smaller dataset and has fewer parameters, it is more energy-efficient, less costly to train (in both time and energy), and has improved latency. An SLM also improves on the issues of bias and transparency because, with a smaller dataset, you have more knowledge of and control over what goes into training your language model. We are unsure if our use case will allow us to use an SLM, but we are researching existing, open-source models in the hope that we will be able to fine-tune an SLM for our purposes.
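To make the size difference concrete, the snippet below counts the parameters of one example small open model using the Hugging Face transformers library. This is purely illustrative: the model name is an assumption on my part, not the model we have chosen, and loading it downloads a few gigabytes of weights (it also requires PyTorch installed).

```python
# Counting parameters of an example small open model to illustrate the
# SLM/LLM distinction. The model name is just an example, not our choice.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e9:.2f} billion parameters")  # well under the ~30B threshold above
```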
Cache-augmented generation (CAG): The most common way to ensure that information is accurate is to check valid sources before providing an answer, and the way language models usually do this is Retrieval-Augmented Generation, or RAG. The idea is that, after receiving a prompt, the language model searches for information about the query to ensure the information is both accurate and up to date before generating an output using the information it has fetched. This step is important to provide accurate information to users and limit the “hallucinations” we are all warned about. But it means that, on top of processing a prompt and generating a response, we now have the added cost of searching for and processing sources every time the model is prompted. Enter Cache-Augmented Generation, or CAG! Instead of searching through a large database of information (which could be the entire internet), the model is pre-loaded with reference data so that the search for reference information is more efficient. CAG works well for information that does not change frequently, such as one of our textbooks, and can also ensure the accuracy and validity of the information cited, so it seems perfect for our use case.
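To make the contrast with RAG concrete, here is a very rough sketch of the CAG idea using the gpt4all Python package (an assumption; our prototype may use a different stack): the reference text is loaded into the model’s context once per session, and every question in that session reuses it, rather than running a new retrieval step per prompt. A real CAG implementation goes further and precomputes the model’s internal cache for the reference text; this is only an approximation, and the file and model names are made up.

```python
# Rough approximation of cache-augmented generation: preload the reference
# material once per session instead of retrieving sources on every prompt.
# The textbook file and model file names are made up; a real setup would also
# manage context size carefully.
from gpt4all import GPT4All

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf", n_ctx=8192)  # larger context window

with open("chapter1.txt", encoding="utf-8") as f:  # hypothetical textbook chapter
    chapter = f.read()

system_prompt = (
    "Answer study questions using only the following course material:\n\n" + chapter
)

# Every prompt in this session reuses the preloaded chapter; no per-query retrieval.
with model.chat_session(system_prompt=system_prompt):
    print(model.generate("Summarize the key ideas of this chapter.", max_tokens=300))
    print(model.generate("Write two practice questions about it.", max_tokens=300))
```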
Caching generated output: Judicious AI Use to Improve Existing OER by Royce Kimmons, George Veletsianos, and Torrey Trust suggests using caching to improve the energy efficiency of the language model. As we discussed, each prompt uses some amount of energy, and that remains the case even if it’s the same prompt over and over. So, by caching (storing) some of these generated responses from the language model and returning them when a query is repeated, the tool uses less energy because it does not have to process and generate a new response each time. Further, the authors suggest serving those cached responses to students as OER, which reduces the number of prompts altogether and contributes to improving the equity of generative AI.
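A minimal version of that output cache might look like the sketch below: responses are stored keyed on a normalized prompt, so a repeated question is answered from disk instead of running the model again. The cache file name and the gpt4all model are illustrative assumptions, not our actual design.

```python
# Minimal sketch of caching generated responses so repeated prompts don't
# re-run the model. Cache file and model names are illustrative assumptions.
import hashlib
import json
import os

from gpt4all import GPT4All

CACHE_PATH = "response_cache.json"
cache = {}
if os.path.exists(CACHE_PATH):
    with open(CACHE_PATH, encoding="utf-8") as f:
        cache = json.load(f)

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")  # example model file

def cached_generate(prompt: str) -> str:
    """Return a cached response if this (normalized) prompt was seen before."""
    key = hashlib.sha256(prompt.strip().lower().encode("utf-8")).hexdigest()
    if key not in cache:  # only spend compute on prompts we haven't answered yet
        cache[key] = model.generate(prompt, max_tokens=400)
        with open(CACHE_PATH, "w", encoding="utf-8") as f:
            json.dump(cache, f)
    return cache[key]

print(cached_generate("What is photosynthesis?"))
print(cached_generate("What is photosynthesis?"))  # second call is served from the cache
```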
Though I’m a little overwhelmed by the sheer volume of information out there regarding energy efficiency in AI, much less the complex subject of AI ethics in general, I am excited about exploring these three solutions. As we start working with the developers for this project, I am interested in learning about the implementation of these concepts and the feasibility of implementation in our AI Study Tool. Researching this project and ensuring it aligns with our values feels like a puzzle I’m trying to solve, and I am enjoying delving into the world of computer science once again and flexing those problem-solving muscles.
With the Hewlett funding now in place, we are busy at BCcampus scoping out a project plan for the Open GenAI project. The team supporting this work will be myself, Robynne Devine (Senior Project Manager, PMO), Harper Friedman (Coordinator, Open Textbook Publishing), and Kaitlyn Zheng (Coordinator, Project Support and Open Publication), with Clint Lalonde as our Project Champion. I will be leaning on Clint a lot at the beginning given his deep understanding of and vision for this project. Harper and Kaitlyn are already busy doing some preliminary research. I have been focusing on project documentation and doing as much learning as I can on Gen AI.
The goal of this project is to explore and experiment with a variety of open-source Gen AI tools and technologies that align with open education values, including focusing on issues of equity, accessibility, and inclusivity while also offering offline access, reducing environmental impact, and ensuring student privacy. BCcampus will work with partners on the development of an ethically focused, AI-powered study tool prototype to complement the BCcampus open textbook collection.
WOW, ok that sounds like a lot, and it is. There are so many questions floating around in my brain right now.
What defines an equitable open Gen AI tool or technology?
Is it possible to have a fully accessible open GenAI tool?
What defines an ethical AI-powered study tool prototype?
How can we build this to ensure it’s improving learning and not taking away from it?
How do we define ethical?
I lean towards an AI study tool that asks students questions to help guide them to deepen their learning, rather than providing answers or summaries, but that is just me and it is early in the project. But I am trying to enter into this project being aware of the biases (good and bad) around GenAI. Student safety and improved student experiences are top of mind for this work.