
Reflections on course 'Generative AI with Large Language Models'

 course: Generative AI with Large Language Models

This course is introduced by Andrew Ng and consists mainly of lectures by AWS engineers who explain current theory, along with three hands-on lab sessions.


Week 1: Introduction to LLMs and the project lifecycle

The first week introduces the architecture of Large Language Models (LLMs), the concepts behind them, and ways of extending them. A lab session demonstrates how to load a pretrained model and work with additional datasets. The lectures also cover the initial pre-training phase and discuss how the training domain affects both general-purpose and specialized models.
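To give a feel for the kind of loading the lab walks through, here is a minimal sketch of my own using the Hugging Face transformers and datasets libraries; the model name (google/flan-t5-base) and dataset (knkarthick/dialogsum) are assumptions for illustration, not necessarily the exact ones in the course notebook.

```python
# Minimal sketch: load a pretrained seq2seq model and a dialogue dataset,
# then try zero-shot summarization on one example.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from datasets import load_dataset

model_name = "google/flan-t5-base"            # assumed small seq2seq model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

dataset = load_dataset("knkarthick/dialogsum")  # assumed dialogue dataset

# Build a prompt from one dialogue and let the base model summarize it.
prompt = "Summarize the following conversation.\n\n" + dataset["test"][0]["dialogue"]
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```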

Week 2: Fine-Tuning

Week 2 explains how a model, after its initial training, can be further trained to strengthen the areas a user actually needs. It walks through concrete examples of fine-tuning, and a lab session demonstrates the additional training and how to integrate the resulting model.
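Since lightweight additive methods such as LoRA come up again later in this post, here is a minimal sketch of wrapping a base model with a LoRA adapter using the Hugging Face peft library; the base model and hyperparameters are my own assumptions for illustration.

```python
# Minimal sketch: parameter-efficient fine-tuning (LoRA) with peft.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model, TaskType

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")  # assumed

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # attention projections to adapt (T5 naming)
)

peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only a small fraction is trainable
# The wrapped model can then be trained with the usual Trainer or a custom loop.
```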

Week 3: Reinforcement learning from human feedback (RLHF) 

Week 3 discusses the social responsibility of AI, including the theory behind training a model to suppress harmful outputs, a process that relies on human evaluation. A lab session demonstrates the technique. The final part covers architectures for integrating LLMs with professional software and the related services provided by AWS.
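The core idea behind RLHF is a reward signal that reflects human preferences. As a hedged illustration of that idea (my own sketch, not the course lab), a toxicity classifier can stand in for the reward model; the classifier name and label order below are assumptions.

```python
# Minimal sketch: score a reply with a toxicity classifier and treat the
# probability of the "not hateful" class as a scalar reward.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

reward_name = "facebook/roberta-hate-speech-dynabench-r4-target"  # assumed classifier
reward_tokenizer = AutoTokenizer.from_pretrained(reward_name)
reward_model = AutoModelForSequenceClassification.from_pretrained(reward_name)

def reward(text: str) -> float:
    """Return a scalar reward: higher means less toxic."""
    inputs = reward_tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = reward_model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    return probs[0].item()  # assumed: index 0 = "nothate"

print(reward("Thank you, that was a helpful explanation."))
# In full RLHF, a PPO loop (e.g. with the trl library) would update the LLM to
# maximize this reward while staying close to the original model.
```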


After completing this course, I found it both practical and interesting, although I didn't pay for the course and therefore couldn't use AWS's environment for the hands-on labs.

However, my previous experience with Stable Diffusion allowed me to reproduce the exercises in Google Colab, following the walkthrough explanations from the labs. Drawing on my experience with stabilityai, I also built a convenient test web app using Flask.
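The web app itself is not part of the course, so the following is only a minimal sketch of the kind of Flask wrapper I mean: a single endpoint that forwards a prompt to a text-generation pipeline. The model name and route are hypothetical.

```python
# Minimal sketch: a Flask test app wrapping a text2text-generation pipeline.
from flask import Flask, request, jsonify
from transformers import pipeline

app = Flask(__name__)
generator = pipeline("text2text-generation", model="google/flan-t5-base")  # assumed model

@app.route("/generate", methods=["POST"])
def generate():
    prompt = request.json.get("prompt", "")
    result = generator(prompt, max_new_tokens=128)
    return jsonify({"output": result[0]["generated_text"]})

if __name__ == "__main__":
    app.run(port=5000, debug=True)
```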


After reviewing the datasets used in the labs, I truly understood how much effort development teams put into training an AI to hold conversations. I also felt sorry for the early models that were led astray by groups of internet users. For the average user, further training an existing model matters far more than building a new one: building the required functionality on top of an already conversational model is what can significantly broaden the range of AI applications. Training a basic conversational model is an arduous process that demands considerable resources and time. If its applications can be expanded, the cost can gradually be amortized, or the income from application-level usage can be reinvested to strengthen and rebuild the foundational model.

From this course, I unexpectedly learned that conversational AI is not inherently good at the computational tasks that come naturally to machines. Many enterprise processes are already digitized, and for a conversational AI to execute calculations or pre-designed business processes, those functions have to be integrated as external services. Given my years of experience designing game systems and distributed systems, I originally assumed this would be straightforward: building one universal tool that does everything is hard, but a powerful system can be composed from many small, specialized features loaded as users need them. However, getting a conversational AI to recognize and use professional tools turns out not to be as simple as a human user operating a process-driven system.
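To make the "functions as external services" idea concrete, here is a hypothetical sketch (not from the course) of the pattern: the model only emits a structured tool call, and the application performs the actual calculation through a registered function. All names here are invented for illustration.

```python
# Minimal sketch: register small, specialized tools and dispatch a structured
# tool call that an LLM would emit instead of doing the arithmetic itself.
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., float]] = {}

def register_tool(name: str):
    """Register a small, specialized function the assistant may call."""
    def wrapper(fn):
        TOOLS[name] = fn
        return fn
    return wrapper

@register_tool("order_total")
def order_total(unit_price: float, quantity: int, tax_rate: float) -> float:
    return unit_price * quantity * (1 + tax_rate)

def dispatch(tool_call: dict) -> float:
    """Execute a tool call described as structured output from the model."""
    return TOOLS[tool_call["name"]](**tool_call["arguments"])

# Example: the model outputs a structured request; the application computes it.
print(dispatch({"name": "order_total",
                "arguments": {"unit_price": 19.9, "quantity": 3, "tax_rate": 0.05}}))
```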

From the course, I learned that models can be enhanced through lightweight additive methods as well as more resource-intensive retraining. Recent studies suggest that models extended with the LoRA method can be deconstructed through reverse engineering. Google is focusing on smaller models, AWS on computational efficiency, and OpenAI on more powerful models. AI has lowered the barrier to using technology and has made many ancillary tasks, which used to cost significant time and money without being the main focus, easy to accomplish, allowing attention to stay on the primary work.

