Course: Generative AI with Large Language Models
The course is introduced by Andrew Ng and mainly consists of lectures by AWS engineers explaining current theory, plus three hands-on labs.
Week 1: Introduction to LLMs and the Project Lifecycle
The course introduces the architecture of Large Language Models (LLMs), the concepts behind them, and how they scale. It includes a hands-on lab on loading a pretrained model and working with datasets. The lectures cover the pre-training phase and discuss how domain-specific data affects both general-purpose and specialized models.
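To give a feel for the Week 1 lab flow, here is a minimal sketch of loading a pretrained model and probing it on a dialogue dataset, assuming the Hugging Face transformers and datasets libraries. The model and dataset names are illustrative stand-ins, not necessarily the course's exact materials.

```python
# Load a pretrained seq2seq model and try zero-shot summarization.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from datasets import load_dataset

model_name = "google/flan-t5-base"  # assumption: any instruction-tuned seq2seq model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# A dialogue-summarization dataset to probe the model's zero-shot behavior.
dataset = load_dataset("knkarthick/dialogsum", split="test")

prompt = f"Summarize the following conversation.\n\n{dataset[0]['dialogue']}\n\nSummary:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```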
Week 2: Fine-Tuning
The course explains how a pretrained model can be further trained to improve in the areas a user needs. It introduces fine-tuning techniques and includes a hands-on lab demonstrating how to fine-tune a model and integrate the result.
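As a concrete illustration, here is a minimal sketch of parameter-efficient fine-tuning with LoRA, assuming the Hugging Face transformers, datasets, and peft libraries. The toy dataset and every hyperparameter are illustrative assumptions, not the course's lab setup.

```python
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Trainer, TrainingArguments)
from peft import LoraConfig, TaskType, get_peft_model

model_name = "google/flan-t5-base"  # assumption
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# LoRA injects small trainable low-rank matrices; the base weights stay frozen.
model = get_peft_model(base_model, LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM, r=16, lora_alpha=32, lora_dropout=0.05))
model.print_trainable_parameters()  # typically well under 1% of all weights

# Assumption: a toy dataset standing in for real dialogue-summary pairs.
raw = Dataset.from_dict({
    "prompt": ["Summarize: A: Hi! B: Hello, how are you doing?"],
    "summary": ["Two people exchange greetings."],
})

def tokenize(example):
    enc = tokenizer(example["prompt"], truncation=True)
    enc["labels"] = tokenizer(example["summary"], truncation=True)["input_ids"]
    return enc

train_set = raw.map(tokenize, remove_columns=raw.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", num_train_epochs=1,
                           per_device_train_batch_size=1, learning_rate=1e-3),
    train_dataset=train_set,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
).train()
```

Because only the small adapter matrices are trained, this runs on far more modest hardware than full fine-tuning.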
Week 3: Reinforcement Learning from Human Feedback (RLHF)
The course discusses the social responsibility side of AI, including how models are trained to avoid harmful outputs, a process that relies on human evaluation. A hands-on lab demonstrates it in practice. The final part covers architectures for integrating LLMs with external applications and the related services provided by AWS.
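The core mechanism is a reward model scoring the policy's outputs. Here is a minimal sketch of that scoring step for detoxification, assuming a Hugging Face hate-speech classifier as the reward model; the model name and label index are assumptions, and the full RL loop (e.g. via the trl library) is omitted.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

reward_name = "facebook/roberta-hate-speech-dynabench-r4-target"  # assumption
tok = AutoTokenizer.from_pretrained(reward_name)
reward_model = AutoModelForSequenceClassification.from_pretrained(reward_name)

def reward(text: str) -> float:
    """Higher reward for less harmful text (logit of the benign class)."""
    with torch.no_grad():
        logits = reward_model(**tok(text, return_tensors="pt")).logits
    return logits[0, 0].item()  # assumption: index 0 is the non-hateful class

# In RLHF, an RL step (commonly PPO) then nudges the language model toward
# higher-reward completions, while a KL penalty keeps it close to the original.
print(reward("Have a wonderful day!"))
```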
After completing this course, I found it both practical and interesting. Since I audited it without paying, I couldn't use AWS's environment for the labs. However, my previous experience with Stable Diffusion let me reproduce the labs in Google Colab by following the walkthrough explanations, and drawing on my experience with stabilityai, I built a convenient test web app using Flask.
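For anyone curious, the test app amounts to little more than this sketch: a single Flask route wrapping a generation pipeline. The model choice and parameters here are placeholders, not my exact setup.

```python
from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)
generator = pipeline("text2text-generation", model="google/flan-t5-base")  # placeholder

@app.route("/chat", methods=["POST"])
def chat():
    # Accept {"prompt": "..."} and return the model's completion as JSON.
    data = request.get_json(silent=True) or {}
    prompt = data.get("prompt", "")
    result = generator(prompt, max_new_tokens=128)[0]["generated_text"]
    return jsonify({"response": result})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```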
After reviewing the datasets used in the labs, I truly appreciated how much effort development teams put into training conversational AI. I also feel sympathy for the early models that were derailed by groups of internet users. For the average user, further training an existing model matters far more than building a new one. Building the needed functionality on top of an already conversational model is the approach that can significantly broaden AI's range of applications. Training a basic conversational model from scratch is an arduous process that demands considerable resources and time; if its applications can be expanded, those costs can gradually be amortized, or the income from application-level usage can be reinvested to strengthen and rebuild the foundational model.
From this course, I unexpectedly learned that conversational AI is not inherently good at the computation that comes naturally to machines. Many enterprise processes are already digitized, and for a conversational model to execute calculations or pre-designed business processes, it needs to call these functions as external services. Given my years of experience designing game systems and distributed systems, I originally assumed this would be straightforward: building one universal tool that does everything is hard, but a powerful system can be composed of many small, specialized functions loaded on demand. It turns out, however, that getting a conversational model to recognize when to use a professional tool, and to use it correctly, is not as simple as a user operating a process-based system. A rough sketch of the pattern I had in mind follows.
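Here the model is prompted to emit a structured tool call, which the host code parses and executes. The tool name and JSON convention are illustrative assumptions, not any specific framework's API.

```python
import json

def get_order_status(order_id: str) -> str:
    """Stand-in for a real business-process service."""
    return f"Order {order_id} shipped yesterday."

TOOLS = {"get_order_status": get_order_status}

def handle_model_output(text: str) -> str:
    """If the model emitted a JSON tool call, run it; otherwise pass the text through."""
    try:
        call = json.loads(text)
        return TOOLS[call["tool"]](**call["arguments"])
    except (json.JSONDecodeError, KeyError, TypeError):
        return text  # plain conversational reply

# Example: given a tool-use prompt, the model responds with a structured call.
model_reply = '{"tool": "get_order_status", "arguments": {"order_id": "A123"}}'
print(handle_model_output(model_reply))  # -> Order A123 shipped yesterday.
```

The hard part, as the course makes clear, is not this plumbing but getting the model to decide reliably when a tool is needed and to fill in its arguments correctly.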
From the course, I learned that models can be enhanced through lightweight additive methods as well as more resource-intensive retraining. Recent work suggests that adapters added with the LoRA method can even be recovered through reverse-engineering techniques. Google is focusing on smaller models, AWS on computational efficiency, and OpenAI on ever more powerful models. AI has lowered the barrier to using technology: many ancillary tasks that once cost significant time and money without being anyone's main focus are now easy to accomplish, freeing attention for the primary work.
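To make "additive" concrete: LoRA leaves the pretrained weight frozen and learns a low-rank update that is simply added on top, W_merged = W + (alpha / r) * (B @ A), which can later be merged in or stripped out. A minimal sketch, with shapes and initialization following the LoRA paper and all numbers illustrative:

```python
import torch

d, k, r, alpha = 512, 512, 8, 16
W = torch.randn(d, k)          # frozen pretrained weight
A = torch.randn(r, k) * 0.01   # trainable low-rank factor (random init)
B = torch.zeros(d, r)          # zero init, so training starts from W unchanged

W_merged = W + (alpha / r) * (B @ A)  # merging the adapter into the base weight

# Because the update is a separate additive term, it can be stored, shipped,
# or subtracted out independently of the base model, which is also what makes
# adapters comparatively easy to analyze or recover.
print(torch.allclose(W, W_merged))  # True while B is still zero
```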