
Reflections on course 'Generative AI with Large Language Models'


The course is introduced by Andrew Ng and consists mainly of lectures by AWS engineers who explain the current theory, plus three hands-on labs.


Week 1: Introduction to LLMs and the project lifecycle

The course introduces the architecture of Large Language Models (LLMs), the concepts behind them, and ways to scale and extend them. A hands-on lab demonstrates how to load a pre-trained model and a fine-tuning dataset. The lectures cover the pre-training phase and discuss how the choice of training data domain affects both general-purpose and specialized capabilities.
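To reproduce the Week 1 load-and-prompt workflow outside AWS, something like the following works in Colab. This is a minimal sketch assuming the Hugging Face transformers and datasets libraries; google/flan-t5-base and the knkarthick/dialogsum dataset are my stand-ins for whatever the official lab uses.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from datasets import load_dataset

# Load a pre-trained model and tokenizer (stand-in model name).
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Load a dialogue dataset (stand-in dataset name).
dataset = load_dataset("knkarthick/dialogsum")

# Zero-shot prompt: ask the model to summarize one dialogue.
dialogue = dataset["test"][0]["dialogue"]
prompt = f"Summarize the following conversation.\n\n{dialogue}\n\nSummary:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```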

Week 2: Fine-Tuning

The course explains how a pre-trained model can be fine-tuned to strengthen the capabilities a user needs. It introduces examples of fine-tuning and includes a hands-on lab demonstrating how to train additional weights and integrate them back into the model.
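For readers curious what this "extension training" looks like in code, here is a hedged sketch of parameter-efficient fine-tuning with LoRA via the Hugging Face peft library; the rank and scaling hyperparameters are illustrative defaults, not the course's exact settings.

```python
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model, TaskType

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                        # rank of the low-rank update matrices
    lora_alpha=32,               # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q", "v"],   # attention projections in T5
)

# Wrap the frozen base model with trainable LoRA adapters.
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only a small fraction is trainable
```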

Week 3: Reinforcement Learning from Human Feedback (RLHF)

The course discusses the social responsibility of AI, including how models are trained to avoid producing harmful content, a process that relies on human evaluation. It also offers hands-on demonstrations. The final part covers architectures for integrating LLMs with external applications and the related services provided by AWS.
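To make the human-evaluation idea concrete, here is a minimal sketch of the reward-scoring half of RLHF. The toxicity classifier named below is my stand-in for a reward model; the actual RL update loop (e.g. PPO) that would push the LLM toward higher-reward outputs is omitted.

```python
from transformers import pipeline

# A reward model scores completions; human preference data is what such
# classifiers are trained on. This particular model is my stand-in choice.
reward_model = pipeline(
    "text-classification",
    model="facebook/roberta-hate-speech-dynabench-r4-target",
)

candidates = [
    "I disagree with you, but I respect your point of view.",
    "You are an idiot and nobody wants to hear you.",
]
for text in candidates:
    # Higher "nothate" score would translate to a higher reward for PPO.
    print(text, "->", reward_model(text))
```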


After completing this course, I found it both practical and interesting, although since I didn't pay for the course I couldn't use AWS's environment for the labs.

Fortunately, my previous experience with Stable Diffusion let me reproduce the labs in Google Colab by following the walkthrough explanations. Drawing on my experience with stabilityai models, I also built a convenient test web app using Flask.
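For illustration, a test app of that kind can be as small as the sketch below. It assumes a locally loaded Hugging Face model, and the endpoint name and JSON payload shape are my own choices rather than anything from the course.

```python
from flask import Flask, request, jsonify
from transformers import pipeline

app = Flask(__name__)
# Load the model once at startup (stand-in model name).
generator = pipeline("text2text-generation", model="google/flan-t5-base")

@app.route("/generate", methods=["POST"])
def generate():
    # Forward the prompt to the model and return the completion as JSON.
    prompt = request.json.get("prompt", "")
    result = generator(prompt, max_new_tokens=64)
    return jsonify({"completion": result[0]["generated_text"]})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```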


After reviewing the datasets used in the labs, I truly understood how much effort development teams put into training AI for conversation. I also feel sympathy for the early AI models that were misled by groups of internet users. For the average user, extending the training of an existing AI matters far more than creating a new one. Building the needed functionality on top of an already conversational AI is the approach that can significantly broaden AI's application scope. Training a basic conversational AI is an arduous process that requires considerable resources and time. If its applications can be expanded, the costs can gradually be amortized, or the income from application-level usage can be reinvested to strengthen and rebuild the foundational conversational AI.

From this course, I unexpectedly learned that conversational AI is not inherently good at the computation that comes naturally to machines. Many enterprise processes are now digitized, and for a conversational AI to execute calculations or pre-designed business processes, it has to call these functions as external services. From my years designing game systems and distributed systems, I originally thought this would be straightforward: building one universal tool that does everything is hard, but a powerful system can be composed of many small, specialized features loaded on demand. However, getting a conversational AI to recognize and use professional tools turns out to be harder than having users operate a process-based system.
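Here is a toy sketch of the tool-integration pattern I mean: the model emits a structured "tool call" that the surrounding program parses and executes. The JSON protocol below is invented for illustration; real systems define their own schemas.

```python
import json

def calculate(expression: str) -> str:
    """A 'professional tool' the LLM itself cannot do reliably: exact arithmetic."""
    # Evaluate with builtins stripped so only plain arithmetic is possible.
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculate": calculate}

def dispatch(model_output: str) -> str:
    """Parse a tool call like {"tool": "calculate", "input": "12 * 34"} and run it."""
    call = json.loads(model_output)
    return TOOLS[call["tool"]](call["input"])

# Pretend the model responded with this structured call:
print(dispatch('{"tool": "calculate", "input": "12 * 34"}'))  # -> 408
```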

From the course, I learned that AI can be enhanced through simple additive methods as well as through more resource-intensive retraining. Recent studies indicate that models extended with the LoRA method can be deconstructed through reverse-engineering algorithms. Google is focusing on smaller models, AWS on computational efficiency, and OpenAI on more powerful models. AI has lowered the barrier for people to use technology and has made many ancillary tasks, which previously cost significant time and money without being the main focus, easy to accomplish. This lets attention stay on the primary tasks.
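As a numeric aside on why LoRA counts as an "additive" method: the adapted weight is the frozen base weight plus a low-rank product, so only a small fraction of parameters is trained. The shapes below are arbitrary toy values.

```python
import numpy as np

d, k, r = 512, 512, 8          # original weight dims and LoRA rank
W = np.random.randn(d, k)      # frozen pre-trained weight
A = np.random.randn(r, k)      # trainable low-rank factor
B = np.zeros((d, r))           # B starts at zero, so training starts from W
alpha = 16                     # scaling factor

# The adapted weight is the base plus a scaled low-rank update.
W_adapted = W + (alpha / r) * (B @ A)

print("full weight params:", W.size)           # 262144
print("LoRA params:       ", A.size + B.size)  # 8192, ~3% of the full matrix
```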

