Tuesday, October 22, 2024

Exploring NotebookLM: A New Frontier in AI-Assisted Research


One of the most exciting innovations in the AI space I found recently is NotebookLM. This tool allows users to upload various types of content—PDFs, websites, YouTube videos, audio files, Google Docs, or Google Slides—and receive detailed summaries. But that’s not all. Powered by the multimodal capabilities of Gemini 1.5, NotebookLM goes even further, uncovering insightful connections between topics and helping users gain a deeper understanding of their material.

Curious to explore its potential? You can find more information here.

As an example, I recently uploaded some of my blog posts to NotebookLM, and it generated a fantastic 12-minute podcast, showcasing its ability to distill key ideas into engaging audio content.






Friday, September 6, 2024

From Korea to the World: My Journey of Self-Discovery and Growth


As a professional from Korea, I never imagined that my career would take me across borders, from Korea to Singapore and eventually the United States. The experiences I’ve had along the way have not only shaped my professional life but also helped me grow as an individual. In this post, I’ll share my journey and the valuable lessons I’ve learned, with the hope that it may inspire you to aim high, be kind, and always strive to do your best.


Growing Up in Korea: Curiosity Beyond Borders

From a young age, I was fascinated by cultures and lifestyles beyond Korea. I was curious about different languages and ways of thinking, but as a non-English speaker, I thought my chances of exploring the world were limited. However, with determination and the desire to grow, I began learning English and gradually built the confidence I needed to pursue my dreams.


Beginning My Career at IBM

My professional journey began at IBM Korea, where I had the opportunity to work on a wide range of projects, focusing on emerging technologies like AI, Hybrid Cloud, IoT, and text analytics. Over time, my expertise grew, and I was entrusted with leading a technical team in these areas. Eventually, I was honored with the title of IBM Distinguished Engineer and received the prestigious Best of IBM award in both 2013 and 2019.

Being selected for IBM’s fast-track leadership program for high-potential talent was a pivotal moment in my career. It allowed me to collaborate closely with senior leaders, including the Country General Manager in Korea, and support executives in the Asia-Pacific region. This experience broadened my perspective on leadership and global collaboration.


Expanding My Horizons: Global Experience

One of the most enriching aspects of my career has been the chance to work across regions and with diverse global teams. Based in Singapore, I led teams throughout the APAC and Japan regions. Later, I transitioned to the U.S. to lead the Worldwide Architecture Guild within IBM Consulting Hybrid Cloud & Services Solutioning.

In addition to managing teams, I spearheaded several impactful projects on emerging technology areas. I was an early advocate for location-awareness services and IoT solutions, and I pushed for the adoption of advanced technologies like AI (Watson), Cloud-native architectures, Kubernetes (k8s), and RedHat OpenShift.


Certifications and Recognition

Over the course of my career, I’ve made it a priority to stay at the forefront of industry advancements through continuous learning. I hold certifications in key areas, including Red Hat OpenShift Administration (EX280) and Application Development (EX288), Microsoft Azure Solutions Architecture (AZ-303 and AZ-304), and IBM Certified Solution Architect – Cloud v4. These certifications have been crucial in shaping my understanding of modern cloud and hybrid cloud environments.

I am also grateful to have been recognized for my work, receiving awards such as the Best of IBM in 2013 and 2019. These accolades remind me of the importance of dedication and constant improvement.


Key Lessons Along the Way

Throughout my journey, there have been several lessons that have significantly shaped both my personal and professional life:

  • Embrace Change and Take Calculated Risks: Growth happens when we step outside of our comfort zone. Embracing uncertainty and taking well-considered risks has opened doors to opportunities I never thought possible.

  • Be Open-Minded and Proactive: A global career requires adaptability. Instead of retreating in the face of new challenges, I’ve learned that approaching each situation with curiosity and a proactive mindset fosters success.

  • Kindness and Compassion Are Essential: When working in a new country or with new teams, it’s easy to feel overwhelmed. However, practicing kindness and compassion helps build strong relationships and a supportive community—qualities that are key to thriving in any environment.


Aim High, Be Kind, and Do Your Best

As I reflect on my career, the message I want to share is simple: aim high, be kind, and always do your best. Each interaction is an opportunity to grow and learn, both for yourself and those around you. It’s natural to make mistakes along the way, but the important part is how we respond and grow from them.

My journey has been one of self-discovery, resilience, and continual growth. While the road ahead will undoubtedly bring new challenges, I know that with determination, an open heart, and a willingness to learn, anything is possible. I hope my story encourages you to pursue your goals with passion and courage.

Monday, April 1, 2024

Understanding 'Instruct' and 'Chat' Models

In the realm of artificial intelligence, Large Language Models (LLMs) have emerged as veritable powerhouses, capable of processing and generating human-like text with remarkable accuracy. These models, such as OpenAI's GPT series, owe their prowess to rigorous training on colossal datasets harvested from the vast expanse of the internet.

Foundations of LLMs

At the core of LLMs lies an extensive training process that involves exposing the model to billions of words and texts sourced from diverse digital content, ranging from books and articles to websites. This foundational training, devoid of any annotations or labels, provides the model with a raw representation of language, allowing it to discern intricate patterns and meanings—a methodology known as semi-supervised training.

Versatility and Fine-Tuning

Equipped with this wealth of linguistic knowledge, base LLMs exhibit remarkable versatility, capable of performing an array of language-related tasks, including generating conversational responses and crafting content. However, to further enhance their efficacy in handling specific tasks, a process known as fine-tuning comes into play.

Fine-tuning entails subjecting the pre-trained base model to additional training on smaller, specialized datasets relevant to the desired task. Unlike the initial training phase, these datasets come with labels, featuring examples of model-generated responses aligned with specific prompts. For instance, a prompt querying "What is the capital of England?" might elicit a response like "The capital of England is London."

Understanding the Purpose

It's essential to note that the purpose of this labeled data isn't to impart factual knowledge to the model. Instead, it serves as a guide, instructing the model on the expected response format when presented with certain prompts. Fine-tuning thus adjusts the model's parameters to better align with the nuances and requirements of tasks such as question answering, all while preserving its overarching language understanding.

Introducing 'Instruct' and 'Chat' Models

Within the realm of LLMs, two distinct variants have garnered significant attention: 'Instruct' and 'Chat' models. 'Instruct' models emphasize the process of providing explicit instructions or examples to guide the model's behavior. On the other hand, 'Chat' models focus on fine-tuning for conversational tasks, enabling the model to generate contextually appropriate responses in dialogue settings.

Unleashing the Potential

With an 'Instruct' fine-tuned model at our disposal, the possibilities are endless. Prompting questions like "What is the capital of Australia?" should ideally yield responses such as "The capital of Australia is Canberra," showcasing the model's ability to comprehend and respond to specific queries with precision and accuracy.

In essence, the evolution of LLMs, coupled with fine-tuning techniques like 'Instruct' and 'Chat' models, heralds a new era of artificial intelligence—one where machines not only understand language but also engage with it in a manner akin to human interaction. As we delve deeper into this fascinating domain, the potential for innovation and discovery knows no bounds.

Wednesday, March 27, 2024

Setting Up a Multizone Resiliency Environment with IBM Cloud and Terraform

 

In this blog post, we will discuss how to set up a multizone resiliency environment using Terraform and IBM Cloud services. We will cover the steps to create a Virtual Servers on VPC with auto-scale feature. Reference architecture that we will automatically provision can be found at : https://cloud.ibm.com/docs/pattern-vpc-vsi-multizone-resiliency?topic=pattern-vpc-vsi-multizone-resiliency-web-app-multi-zone


Step 1: Clone the necessary repositories
To start, we need to clone the required repositories. We will be using the IBM Cloud Terraform module for VPC landing zone and autoscale, and the terraform-ibm-landing-zone-vsi-autoscale as the parent composite module.

% git clone git@github.ibm.com:client-solutioning/pattern-vpc-vsi-multizone-resiliency.git


Step 2: List available branches

Before we proceed, let's list all available branches in the repository.


% git branch -a


Clone the virtual server autoscale module with specific branch called "init-module"

% git clone --branch init-module https://github.com/terraform-ibm-modules/terraform-ibm-landing-zone-vsi-autoscale.git


Step 3: Update IBM Cloud Plugins


You may need to update IBM Cloud plugins such as VPC infrastructure plugin in my case.


ibmcloud plugin update vpc-infrastructure




Step 4: Create a Terraform variables file


To pass commonly used parameter values easily, we can create a Terraform variables file called terraform.tfvarsIn this file, we can store the API key, public/private SSH key, and other parameters that we will use throughout the deployment process. Otherwise, you need type those information repeatedly when you run `terraform plan` command.


vi terraform.tfvars













































Step 5: Generate SSH public and private keys


We'll generate SSH public and private keys using the "ssh-keygen" command if they haven't been generated previously. These keys are essential for connecting to IBM Cloud services.


$ ssh-keygen -t rsa -b 4096 -C "changwoo.jung@ibm.com"

$ pbcopy < ~/.ssh/id_ed25519.pub

  # Copies the contents of the id_ed25519.pub file to your clipboard


(base) changwoojung@Changwoos-MacBook-Pro .ssh % ssh-keygen -t rsa -b 4096 -C "changwoo.jung@ibm.com" 

Generating public/private rsa key pair.

Enter file in which to save the key (/Users/changwoojung/.ssh/id_rsa): /Users/changwoojung/.ssh/id_rsa_ibmcloud

Enter passphrase (empty for no passphrase): 

Enter same passphrase again: 

Your identification has been saved in /Users/changwoojung/.ssh/id_rsa_ibmcloud

Your public key has been saved in /Users/changwoojung/.ssh/id_rsa_ibmcloud.pub


You can find SSH public key at ~/.ssh/id_rsa_ibmcloud.pub and SSH private key at ~/.ssh/id_rsa_ibmcloud



After ensuring all necessary files are in place, we'll initialize Terraform using the command terraform init. Following initialization, we'll proceed to plan and apply our deployment using terraform plan and terraform apply, respectively.


Step 6: Initialize Terraform Now that we have all the necessary files, let's initialize Terraform.


% terraform init





























Step 7: Plan and apply


% terraform plan


% terraform apply


Upon successful execution, automatically provisioned resources can be viewed at IBM Cloud Resources.




Step 8: Find the hostname for web tier load balancer
Finally, we'll locate the hostname for the web tier load balancer by accessing the URL: http://4a3b1bc4-us-east.lb.appdomain.cloud/.

With these steps completed, you've successfully established a multizone resiliency environment using IBM Cloud and Terraform.




Tuesday, March 12, 2024

Enhancing Large Language Model (LLM) Development with LangChain: Simplifying Complexities and Streamlining Workflows

 


LangChain (https://github.com/langchain-ai/langchain), an open-source framework, is rapidly gaining recognition as a go-to solution for LLM application development. By offering a streamlined approach to common tasks in LLM applications, LangChain ensures that developers can write more efficient and cleaner code. It does not inherently introduce new capabilities to LLMs but rather simplifies the implementation process. One of the most significant challenges in LLM development lies in handling the intricate "orchestration" required for these models.

LangChain addresses this issue by providing a comprehensive API, enabling developers to manage various aspects more effectively: 
  • Prompt Templating: LangChain simplifies the process of creating and managing prompt templates for diverse use cases, ensuring consistency and reducing manual effort. 

  • Output Parsing: The framework offers built-in functionality for parsing LLM output, allowing developers to extract specific information with ease and accuracy. 

  • Sequence Management: LangChain streamlines the creation and management of a series of calls to multiple LLMs, enabling more efficient workflows and reducing coding overhead. 

  • Session State Maintenance: With LangChain, managing session state between individual LLM calls becomes effortless. This memory-based support ensures that context remains consistent throughout the application flow. 

  • RAG Support: LangChain provides native support for RAG (Reject, Accept, and Grace) patterns, ensuring developers have greater control over their applications' decision-making capabilities. 

This typical summarization use case requires a lot of “orchestration” and “utility” code. LangChain provides an API to simplify implementation










Expanding the Capabilities of Generative AI with Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) represents an innovative advancement in generative artificial intelligence, enabling us to leverage data "beyond the model's scope." This refers to external information that was not integrated into the model during training. By integrating RAG, we can add essential guardrails to the generated output and minimize instances of hallucination. 

RAG offers valuable applications across various generative AI use cases, such as: 

  • Question-answering systems
  • Text summarization
  • Content generation 

To better grasp RAG's functionality, consider a human interaction analogy. Imagine providing a document to an individual and requesting they generate an answer based on the information within that document.

The two primary components of RAG are: 

  1. Retrieval step: In this stage, we search through extensive knowledge bases (documents, websites, databases, etc.) to identify data relevant to the model's instructions. 

  2. Generation step: Similar to traditional generation use cases, this phase generates a response or content based on the information retrieved during the initial step. The crucial distinction lies in how the retrieved information is utilized – it serves as context and is incorporated into the prompt provided for the generative model.

In the realm of vector databases, unstructured data is transformed and stored as numeric representations, which are commonly referred to as embeddings for AI applications. Embeddings are derived by employing embedding models such as word2vec. Consider the illustrative example of a semantic search for the term "computer". Observe the closest matching words in the table below, with distances measured numerically since the search was executed utilizing vector (numeric) data. Feel free to test additional words on this platform. https://projector.tensorflow.org/











Tuesday, March 5, 2024

Harnessing the Power of Generative AI for Business

In the ever-evolving landscape of computer science, Artificial Intelligence (AI) has seen its fair share of enthusiasm and skepticism over the decades. The early 2010s brought a wave of excitement as IBM Watson's triumph on Jeopardy! against human champion Ken Jennings (2011), followed by Google's AlphaGo defeating world champion Lee Se-dol in Go around 2016, fueled the belief in AI's potential applicability beyond imagination. At that time, I didn't envision AI being utilized for general domain Q&A or problem solving. However, with the emergence of foundation models like ChatGPT in 2022, my perspective was broadened significantly. 

While businesses entertain the idea of adopting these capabilities, it is crucial to recognize that enterprise application varies greatly from personal exploration. Security concerns, model understanding, explainability, and data privacy are non-negotiable. To effectively integrate AI into Enterprise applications, it must be designed with four key aspects: Openness, Targeted Application, Trustworthiness, and Empowerment.

  • Openness: Openness highlights the significance of making AI technologies accessible to all. By sharing knowledge, research, and tools, we foster collaboration and innovation. Open AI frameworks, datasets, and algorithms encourage a broader range of individuals and organizations to contribute to AI development.

  • Targeted Application: Targeted Application: Targeted application refers to deploying AI in specific problem areas where it can yield significant positive impacts. Sectors such as healthcare, education, climate change, or poverty alleviation offer opportunities for AI optimization and tackling complex challenges. Examples include talent/HR productivity enhancement, customer service through conversational AI, and app modernization using automated code generation and transformation capabilities.

  • Trustworthiness: Trust is essential for successful AI adoption. Transparency and accountability are vital when designing and deploying AI systems. Explainable algorithms enable users to understand decision-making processes, addressing biases and ensuring fairness. Data privacy and security safeguard individuals' rights and prevent misuse.

  • Empowerment: The future of AI lies in empowering individuals rather than replacing them. AI technologies should augment human capabilities, enhancing efficiency and effectiveness. They can foster creativity, productivity, and better decision-making abilities. 

Benchmarking LLMs in the Wild

 

Have you ever wanted to compare the results of different LLM models from the same prompt? Enter https://chat.lmsys.org/, a powerful tool for comparing the performance of different LLM models on the same task. With this tool, you can easily input multiple LLM models and their corresponding outputs, and then view a comparison of their results side-by-side. You can easily compare the performance of different LLM models on the same task, and gain valuable insights into their strengths and weaknesses. 

Here is the example that I used the same prompt from my previous post.


Explain me the following command.

podman run -d -p 3000:8080 --network slirp4netns:allow_host_loopback=true -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main 



Happy modeling!