Visual ChatGPT

Visual ChatGPT is a version of the ChatGPT language model that has been trained to generate text-based responses to visual prompts. It uses computer vision algorithms to analyze and interpret images, and then generates text based on the content of the image.

To get started with Visual ChatGPT, you will need to provide it with an image that you want to generate a response for. This can be done by either uploading an image file or providing a URL to an image hosted online.

Once the image has been processed by the computer vision algorithms, Visual ChatGPT will generate a response in natural language based on the content of the image. This response can be used to answer questions, provide information, or generate captions for images.

To use Visual ChatGPT, you can interact with it through a chat interface, similar to how you would interact with a human chatbot. Simply provide the image and wait for Visual ChatGPT to generate a response. You can also provide additional context or information to help guide the response generated by the model.

Enable Visual ChatGPT on your Windows Machine

Folow the steps:

# clone the repo
git clone https://github.com/microsoft/visual-chatgpt.git

# Go to directory
cd visual-chatgpt

# download Ananconda & create a new environment
https://www.anaconda.com/products/distribution

conda create -n visgpt python=3.8

# activate the new environment
conda activate visgpt

#  prepare the basic environments
pip install -r requirements.txt

# Generate an API Key/Secret Key from your OpenAi.com 
(see image below)

# set your private OpenAI key (for Windows)
set OPENAI_API_KEY={Your_Private_Openai_Key}
set OPENAI_API_KEY=sk-...

# Start Visual ChatGPT !
# You can specify the GPU/CPU assignment by "--load", the parameter indicates which 
# Visual Foundation Model to use and where it will be loaded to
# The model and device are separated by underline '_', the different models are separated by comma ','
# The available Visual Foundation Models can be found in the following table
# For example, if you want to load ImageCaptioning to cpu and Text2Image to cuda:0
# You can use: "ImageCaptioning_cpu,Text2Image_cuda:0"

# Advice for CPU Users
python visual_chatgpt.py --load ImageCaptioning_cpu,Text2Image_cpu

# Advice for 1 Tesla T4 15GB  (Google Colab)                       
python visual_chatgpt.py --load "ImageCaptioning_cuda:0,Text2Image_cuda:0"
                                
# Advice for 4 Tesla V100 32GB                            
python visual_chatgpt.py --load "ImageCaptioning_cuda:0,ImageEditing_cuda:0,
    Text2Image_cuda:1,Image2Canny_cpu,CannyText2Image_cuda:1,
    Image2Depth_cpu,DepthText2Image_cuda:1,VisualQuestionAnswering_cuda:2,
    InstructPix2Pix_cuda:2,Image2Scribble_cpu,ScribbleText2Image_cuda:2,
    Image2Seg_cpu,SegText2Image_cuda:2,Image2Pose_cpu,PoseText2Image_cuda:2,
    Image2Hed_cpu,HedText2Image_cuda:3,Image2Normal_cpu,
    NormalText2Image_cuda:3,Image2Line_cpu,LineText2Image_cuda:3"
OpenAI – API Keys

Demo

The Visual ChatGPT demo should now be running on your local machine. You can open a web browser and navigate to http://localhost:7868 to interact with the model.

It’s important to note that while Visual ChatGPT can generate responses based on visual prompts, it is not perfect and may sometimes generate inaccurate or inappropriate responses. As with any AI model, it is important to use it responsibly and carefully evaluate the accuracy of its responses.

Cheers!

Reference: https://github.com/microsoft/visual-chatgpt

Convert Microsoft Office Documents into PDF using Microsoft Graph & Azure Functions

One of my recent task was to translate a word document into a pdf from a web application hosted in Azure Web App – therefore the code has to process at the Azure side.

Initially, I thought a traditional approach would work such as Interop or some free Api easily available on the net, however to my great surprise the code was throwing the following exception A generic error occurred in GDI+.

Exception handling in Net: Advanced exceptions | Hexacta

What I learned from this is that all Azure Web Apps (as well as Mobile App/Services, WebJobs, and Functions) run in a secure environment called a sandbox. Each app runs inside its own sandbox, isolating its execution from other instances on the same machine as well as providing an additional degree of security and privacy that would otherwise not be available. The sandbox mechanism aims to ensure that each app running on a machine will have a minimum guaranteed level of service; furthermore, the runtime limits enforced by the sandbox protect apps from being adversely affected by other resource-intensive apps which may be running on the same machine.

The sandbox generally aims to restrict access to shared components of Windows. Unfortunately, many core components of Windows have been designed as shared components: the registry, cryptography, and graphics subsystems, among others. For the sake of radical attack surface area reduction, the sandbox prevents almost all of the Win32k.sys APIs from being called, which practically means that most of User32/GDI32 system calls are blocked. For most applications, this is not an issue since most Azure Web Apps do not require access to Windows UI functionality (they are web applications after all). Since all the major libraries use a lot of GDI calls during the PDF conversion, the default rendering engine does not work on Azure Web Apps. You can find more information about those sandbox restrictions on https://github.com/projectkudu/kudu/wiki/Azure-Web-App-sandbox#win32ksys-user32gdi32-restrictions.

So now the solution is to find an approach to convert the PDF within Azure – luckily I came across a blog from Philipp Bauknecht which is leveraging Microsoft Graph to convert a document to PDF – let us see how.

There are several steps, which you have to perform in the correct order:

  1. Create an App registration in Azure AD and assign the required permissions
  2. Create a new Azure Functions app using Visual Studio 2019
  3. Create an OAuth2 authentication service to request an access token to call the Microsoft Graph
  4. Create a File Service to upload, convert and delete files using the Microsoft Graph
  5. Setup Dependency Injection
  6. Create a new function as the Main entry point
  7. Create a Function App in Azure to host the code and make it available globally
  8. Import the publish profile & deploy using Visual Studio 2019
  9. Test using a Console Application c#
  10. Test using Postman

Step 1: Create an App registration in Azure AD and assign the required permissions

1.1 Go to https://portal.azure.com, then Azure Active Directory and select App Registrations; Click on New registration, provide a name then click on Register

1.2 Once the app is provisioned, on the left navigation blade click on Certificates & secrets; Click on New client secret to create one, then save the value of the secret for later use.

1.3 Go to API permissions, then click on Add a permission then Microsoft Graph, then choose Application permissions to add the following permissions (Admin consent is a must):

1.4 Go to Overview and save the values of Application (client) Id and Directory (tenant) Id for later use.

Step 2: Create a new Azure Functions app using Visual Studio 2019

Open Visual Studio 2019 and Create a new project in which choose Azure Functions

Step 3: Create an OAuth2 authentication service to request an access token to call the Microsoft Graph

This class is responsible to get the access token.

using Microsoft.Extensions.Options;
using Newtonsoft.Json;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

namespace PdfConversionFunctionApp
{
    public class AuthenticationService
    {
        public static async Task<string> GetAccessTokenAsync(ApiConfig _apiConfig)
        {
            var values = new List<KeyValuePair<string, string>>
            {
                new KeyValuePair<string, string>("client_id", _apiConfig.ClientId),
                new KeyValuePair<string, string>("client_secret", _apiConfig.ClientSecret),
                new KeyValuePair<string, string>("scope", _apiConfig.Scope),
                new KeyValuePair<string, string>("grant_type", _apiConfig.GrantType),
                new KeyValuePair<string, string>("resource", _apiConfig.Resource)
            };
            var client = new HttpClient();
            var requestUrl = $"{_apiConfig.Endpoint}{_apiConfig.TenantId}/oauth2/token";
            var requestContent = new FormUrlEncodedContent(values);
            var response = await client.PostAsync(requestUrl, requestContent);
            var responseBody = await response.Content.ReadAsStringAsync();
            dynamic tokenResponse = JsonConvert.DeserializeObject(responseBody);
            return tokenResponse?.access_token;
        }
    }
}

Step 4: Create a File Service to upload, convert and delete files using the Microsoft Graph

This class is responsible to upload, convert and delete the file.

using Newtonsoft.Json;
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace PdfConversionFunctionApp
{
    public class FileService
    {
        private readonly ApiConfig _apiConfig;
        private HttpClient _httpClient;

        public FileService(ApiConfig apiConfig)
        {
            _apiConfig = apiConfig;
        }

        private async Task<HttpClient> CreateAuthorizedHttpClient()
        {
            if (_httpClient != null)
            {
                return _httpClient;
            }

            var token = await AuthenticationService.GetAccessTokenAsync(_apiConfig); 
            _httpClient = new HttpClient();
            _httpClient.DefaultRequestHeaders.Add("Authorization", $"Bearer {token}");

            return _httpClient;
        }

        public async Task<string> UploadStreamAsync(string path, Stream content, string contentType)
        {
            var httpClient = await CreateAuthorizedHttpClient();

            string tmpFileName = $"{Guid.NewGuid().ToString()}{MimeTypes.MimeTypeMap.GetExtension(contentType)}";
            string requestUrl = $"{path}root:/{tmpFileName}:/content";
            var requestContent = new StreamContent(content);
            requestContent.Headers.ContentType = new MediaTypeHeaderValue(contentType);
            var response = await httpClient.PutAsync(requestUrl, requestContent);
            if (response.IsSuccessStatusCode)
            {
                dynamic file = JsonConvert.DeserializeObject(await response.Content.ReadAsStringAsync());
                return file?.id;
            }
            else
            {
                var message = await response.Content.ReadAsStringAsync();
                throw new Exception($"Upload file failed with status {response.StatusCode} and message {message}");
            }
        }

        public async Task<byte[]> DownloadConvertedFileAsync(string path, string fileId, string targetFormat)
        {
            var httpClient = await CreateAuthorizedHttpClient();

            var requestUrl = $"{path}{fileId}/content?format={targetFormat}";
            var response = await httpClient.GetAsync(requestUrl);
            if (response.IsSuccessStatusCode)
            {
                var fileContent = await response.Content.ReadAsByteArrayAsync();
                return fileContent;
            }
            else
            {
                var message = await response.Content.ReadAsStringAsync();
                throw new Exception($"Download of converted file failed with status {response.StatusCode} and message {message}");
            }
        }

        public async Task DeleteFileAsync(string path, string fileId)
        {
            var httpClient = await CreateAuthorizedHttpClient();

            var requestUrl = $"{path}{fileId}";
            var response = await httpClient.DeleteAsync(requestUrl);
            if (!response.IsSuccessStatusCode)
            {
                var message = await response.Content.ReadAsStringAsync();
                throw new Exception($"Delete file failed with status {response.StatusCode} and message {message}");
            }
        }
    }
}

Step 5: Setup Dependency Injection

5.1 In order to use the FileService and the Configuration properties (local & in Azure), we need to set dependency injection. To use dependency injection in Azure Function app we need to add the package Microsoft.Azure.Functions.Extensions to our app using Nuget.

using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using System;
using System.IO;
using System.Reflection;

[assembly: FunctionsStartup(typeof(PdfConversionFunctionApp.Startup))]
namespace PdfConversionFunctionApp
{
    class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            var fileInfo = new FileInfo(Assembly.GetExecutingAssembly().Location);
            string path = fileInfo.Directory.Parent.FullName;
            var config = new ConfigurationBuilder()
                .SetBasePath(Environment.CurrentDirectory)
                .SetBasePath(path)
                .AddJsonFile("local.settings.json", optional: true, reloadOnChange: true)
                .AddEnvironmentVariables()
                .Build();

            var apiConfig = new ApiConfig();
            config.Bind(nameof(ApiConfig), apiConfig);

            builder.Services.AddSingleton<FileService>();
            builder.Services.AddSingleton(apiConfig);
        }
    }
}

The above code – from line 15 to 25 – takes care of getting the configuration values, if the app runs locally then it loads the local.settings.json, otherwise, it takes the values from the Azure Function Application settings (see Step 7.2)

5.2 Now set the values of TenantId, ClientId & ClientSecret from Step 1; The SiteId correspond to the Document Library where the file will get temporarily uploaded, we will have to GET it using Microsoft Graph Explorer with the following formula:

This is how the local.settings.json looks:

https://graph.microsoft.com/v1.0/sites/{hostname}:/sites/{path}?$select=id
GET => https://graph.microsoft.com/v1.0/sites/myorganization.sharepoint.com?$select=id
GET => https://graph.microsoft.com/v1.0/sites/myorganization.sharepoint.com:/sites/Contoso/Operations/Manufacturing?$select=id
Response => 
{
    "@odata.context": "https://graph.microsoft.com/v1.0/$metadata#sites(id)/$entity",
    "id": "myorganization.sharepoint.com,74796aa9-17f6-4c09-9b20-1d78bfdcbac4,98f692fe-ea45-423b-8001-0b9c6bb2b50f"
}
What you get back in the id is in this format: {hostname},{spsite.id},{spweb.id}. 
What we need is then the {spsite.id} which is 74796aa9-17f6-4c09-9b20-1d78bfdcbac4
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "graph:Endpoint": "https://login.microsoftonline.com/",
    "graph:GrantType": "client_credentials",
    "graph:Scope": "Files.ReadWrite.All",
    "graph:Resource": "https://graph.microsoft.com",
    "graph:TenantId": "",
    "graph:ClientId": "",
    "graph:ClientSecret": "",
    "pdf:GraphEndpoint": "https://graph.microsoft.com/v1.0/",
    "pdf:SiteId": ""
  },
  "ApiConfig": {
    "Endpoint": "https://login.microsoftonline.com/",
    "GrantType": "client_credentials",
    "Scope": "Files.ReadWrite.All",
    "Resource": "https://graph.microsoft.com",
    "TenantId": "",
    "ClientId": "",
    "ClientSecret": "",
    "GraphEndpoint": "https://graph.microsoft.com/v1.0/",
    "SiteId": ""
  }
}

Step 6: Create a new function as the Main entry point

Add a new function to your project and name it ConvertToPdf. Select the Http trigger so our function can be called via a http request and pick Authorization level Anonymous so we don’t need to provide any credentials when calling this function; Replace the below code

using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;
using System.Threading.Tasks;

namespace PdfConversionFunctionApp
{
    public class ConvertToPdf
    {
        private readonly FileService _fileService;
        private readonly ApiConfig _apiConfig;

        public ConvertToPdf(FileService fileService, ApiConfig apiConfig)
        {
            _fileService = fileService;
            _apiConfig = apiConfig;
        }

        [FunctionName("ConvertToPdf")]
        public async Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = null)] HttpRequest req, ILogger log)
        {
            if (req.Headers.ContentLength == 0)
            {
                log.LogInformation("Please provide a file.");
                return new BadRequestObjectResult("Please provide a file.");
            }

            var path = $"{_apiConfig.GraphEndpoint}sites/{_apiConfig.SiteId}/drive/items/";

            var fileId = await _fileService.UploadStreamAsync(path, req.Body, req.ContentType);

            var pdf = await _fileService.DownloadConvertedFileAsync(path, fileId, "pdf");

            await _fileService.DeleteFileAsync(path, fileId);

            return new FileContentResult(pdf, "application/pdf");
        }
    }
}

Step 7: Create a Function App in Azure to host the code and make it available globally

7.1 Go to https://portal.azure.com, then click on Create Function App

7.2 Once the app is provisioned, on the left navigation blade click on Configuration, then New application setting – we will have to add the below application settings which are needed when the app runs from Azure (the values as the same as step 5.2)

7.3 On the Overview section, download the publish profile while clicking on Get publish profile

Step 8: Import the publish profile & deploy using Visual Studio 2019

8.1 Right-click on Visual Studio, then choose Publish, import your publish settings to deploy your app from the file downloaded in the previous step – then deploy.

8.2 If Debugging is needed then we can use the Azure Function App Log Stream Monitoring features.

Step 9: Test using a Console Application c#

9.1 Create a console application and replace the following code.

using System;
using System.IO;
using System.Net;
using PdfConversionFunctionApp;

namespace PdfConversionConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            string filePathWord = @"C:\Temp\TestDocument.docx";
            string filePathOutWord = @"C:\Temp\TestDocument.pdf";

            string filePathExcel = @"C:\Temp\TestExcel.xlsx";
            string filePathOutExcel = @"C:\Temp\TestExcel.pdf";

            bool IsSuccessWord = ConverToPdf(filePathWord, filePathOutWord);
            bool IsSuccessExcel = ConverToPdf(filePathExcel, filePathOutExcel);
        }

        private static bool ConverToPdf(String filePath, String filePathOut)
        {
            try
            {
                //string urlLocal = "http://localhost:7071/api/ConvertToPdf";
                string urlAzure = "https://graphpdfconverter.azurewebsites.net/api/ConvertToPdf";

                HttpWebRequest req = (HttpWebRequest)WebRequest.Create(urlAzure);
                req.Method = "POST";

                string fileExtension = Path.GetExtension(filePath);
                switch (fileExtension)
                {
                    case ".doc":
                        req.ContentType = "application/msword";
                        break;
                    case ".docx":
                        req.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
                        break;
                    case ".xls":
                        req.ContentType = "application/vnd.ms-excel";
                        break;
                    case ".xlsx":
                        req.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"; ;
                        break;
                    default:
                        throw new Exception("Only Word & Excel documents are supported by the Converter");
                }

                Stream fileStream = System.IO.File.Open(filePath, FileMode.Open);
                MemoryStream inputStream = new MemoryStream();
                fileStream.CopyTo(inputStream);
                fileStream.Dispose();
                Stream stream = req.GetRequestStream();
                stream.Write(inputStream.ToArray(), 0, inputStream.ToArray().Length);
                HttpWebResponse res = (HttpWebResponse)req.GetResponse();

                //Create file stream to save the output PDF file
                FileStream outStream = System.IO.File.Create(filePathOut);
                //Copy the responce stream into file stream
                res.GetResponseStream().CopyTo(outStream);
                //Dispose the input stream
                inputStream.Dispose();
                //Dispose the file stream
                outStream.Dispose();

                return true;
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            return false;


        }
    }
}

9.2 To test and debug locally, Click F5 on the Function App – Visual Studio will provide a POST URL which you can use in the console to run & debug the code.

9.3 To run it from Azure, go to Azure Portal, then open your Azure Function App, on the left navigation blade click on Functions, click on the function name then Get Function Url. Use this URL in the console to convert the document to pdf.

It is important to mention that the Content Type will define the type of docunent to be converted – find the complete list of Common MIME types.

Step 10: Test using Postman

10.1 In Postman, add the Azure Function App Url (see step 9.3).

10.2 On the Header section, add the appropriate MIME Types

10.3 On the Body section, click on Binary and upload a file then click the Send button.

10.4 On successfull request, we can save the converted pdf file.

Summary

As we can see Microsoft Graph allows us to convert easily documents to pdf, that up to 1 million free calls, along with Azure Function it provides the flexibility to use these features anywhere anytime your users want.

Download the code from Github

Digital Transformation – 5 steps to success

In the year 1995, I remember helping my father identifying and classifying all information-bearing documents, whether they were in the form of hard-copy output or computer 5 1/4″ floppy drive – my dad being in the Travel Industry in Madagascar, we had to process pile of files, it was like running out of space on top of his desk and having to process one pile of files at a time in order to free up space for another pile of files – this was a long and tedious task.

To automate this task, I had developed a small application using DBASE III running on a Windows 286 with an MS-DOS Version 5 which was my first computer – the program design was quite simple, I had to enter all the customer data along with one important attribute “the File Number” – the intent was that when someone searching for a particular string, instantly the app shows the file number containing the respective data – as these efforts started to bear fruit, the happiness and satisfaction on his face were palpable; Wish he could have seen today’s digital era where humans are engaging in smarter experiences through technological innovation.

So what is Digital Transformation?

simply put, use technological innovation to convert your manual tasks or create new business processes.

According to Schumpeter, the process of technological change in a free market consists of three parts: invention (conceiving a new idea or process), innovation (arranging the economic requirements for implementing an invention), and diffusion (whereby people observing the new discovery adopt or imitate it) – in layman’s terms, this means Schumpeter argued that anyone seeking profits must innovate.

This is true for all the companies who have adopted and transformed their services or business through technology – for an instance with this COVID pandemic, today with a majority of individuals working remotely, employee experience of digital technology has gone from “nice to have” to “the only way work gets done” – this is a revolution of the introduction of a new technology that creates entirely new ways of serving existing needs and significantly disrupts an existing industry. On the other hand, companies that are not innovating may not be disrupted however does guarantee a poor outcome and may be defeated by the competitors.

What is the 5-key success of Digital Transformation?

Culture

Everything begins with trust! Digital Transformation use technological innovation to replace your manual tasks or create new business processes however we have seen traditional organization/people who have a strong culture resists to adopt these new changes.

Companies can overcome these barriers by inculcating trust, communicate to their workforce their digital transformation strategy then create opportunities for dialog. Upskilling and re-skilling is crucial to ensure adaptability and employability in the transition times.

Strategy

One of the important pillars of the Digital Transformation success is a strong and determined leadership  which is required for the development and implementation of the strategy. A key strategy is to focus primarily on improving Customer Experience which is possible by implementing digital business models and services, let us see some of the model types:

  • Marketplace model: the biggest ecommerce Amazon would have no business without internet, this is amongst the widely used business models.
  • Free model: Build great products and released them for free (Trial or Basic lifetime features) once the customers get accustomed then monetization would not be an issue. Many companies have adopted this model for their products.
  • Subscription-based model: Netflix and Amazon Prime have built an amazing base of loyal customers which guarantee a continuous revenue over time.
  • On-Demand model: this model is based on supply and model, it has two sided players, for example in Uber’s case this model works as soon as the drivers (supply) are  offering their services to the riders (demand).

Upskilling

Invest more in the upskilling of the workforce, employees are the face and heart of any organization – in this process of Digital Transformation, it is imperative to upskill or reskill otherwise you might fail to meet the customer’s expectations.

Continuous Innovation

It is in human nature to love everything new! Innovation increases your chances to react to changes and discover new opportunities which become the key for survival. Take Apple as an example which has done a continuous innovations such as the iPod, iTunes, iPhone, and iPad.

Smarter Customer Experience

An improved customer experience is a significant factor in efficiency, profitability and brand equity which must be the result of your Digital Transformation Strategy. I cite from a  global telecommunications EY’s study “Automation and efficiency improvement of all processes will serve customer experience.”

Get started today

The intent should be to nurtures lives through innovation and technology. This vision is going to transform your business and tap into new and exciting opportunities globally.