Visual ChatGPT

Visual ChatGPT is a system from Microsoft that connects ChatGPT to a set of Visual Foundation Models, so that a chat conversation can include images as well as text. The vision models analyze and interpret images, and ChatGPT then generates text based on the content of the image.

To get started with Visual ChatGPT, you will need to provide it with an image that you want to generate a response for. This can be done by either uploading an image file or providing a URL to an image hosted online.

Once the image has been processed by the computer vision algorithms, Visual ChatGPT will generate a response in natural language based on the content of the image. This response can be used to answer questions, provide information, or generate captions for images.

You interact with Visual ChatGPT through a chat interface, much as you would with any other chatbot. Simply provide the image and wait for Visual ChatGPT to generate a response. You can also provide additional context or information to help guide the response generated by the model.

Enable Visual ChatGPT on your Windows Machine

Follow these steps:

# clone the repo
git clone https://github.com/microsoft/visual-chatgpt.git

# Go to directory
cd visual-chatgpt

# download Anaconda & create a new environment
https://www.anaconda.com/products/distribution

conda create -n visgpt python=3.8

# activate the new environment
conda activate visgpt

# install the required packages
pip install -r requirements.txt

# Generate an API key (secret key) from your OpenAI account
# (see the "OpenAI – API Keys" image below)

# set your private OpenAI key (for Windows)
# replace the placeholder with your own key, which looks like sk-...
set OPENAI_API_KEY={Your_Private_Openai_Key}

# Start Visual ChatGPT !
# You can specify the GPU/CPU assignment with "--load". The parameter indicates which
# Visual Foundation Models to use and which device each one is loaded to.
# Model and device are separated by an underscore '_'; different models are separated by a comma ','.
# The available Visual Foundation Models are listed in a table in the repository README.
# For example, to load ImageCaptioning to cpu and Text2Image to cuda:0,
# use: "ImageCaptioning_cpu,Text2Image_cuda:0"

# Advice for CPU Users
python visual_chatgpt.py --load ImageCaptioning_cpu,Text2Image_cpu

# Advice for 1 Tesla T4 15GB (Google Colab)
python visual_chatgpt.py --load "ImageCaptioning_cuda:0,Text2Image_cuda:0"
                                
# Advice for 4 Tesla V100 32GB
# (on Windows, enter the whole command on a single line; a quoted value cannot span lines)
python visual_chatgpt.py --load "ImageCaptioning_cuda:0,ImageEditing_cuda:0,Text2Image_cuda:1,Image2Canny_cpu,CannyText2Image_cuda:1,Image2Depth_cpu,DepthText2Image_cuda:1,VisualQuestionAnswering_cuda:2,InstructPix2Pix_cuda:2,Image2Scribble_cpu,ScribbleText2Image_cuda:2,Image2Seg_cpu,SegText2Image_cuda:2,Image2Pose_cpu,PoseText2Image_cuda:2,Image2Hed_cpu,HedText2Image_cuda:3,Image2Normal_cpu,NormalText2Image_cuda:3,Image2Line_cpu,LineText2Image_cuda:3"
OpenAI – API Keys
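
The "--load" format described in the comments above (Model_device pairs separated by commas) can be illustrated with a short Python sketch. This is only an illustration of the format, not the repository's own parsing code; the function name parse_load_spec is made up for this example.

```python
from typing import Dict


def parse_load_spec(spec: str) -> Dict[str, str]:
    """Turn 'ImageCaptioning_cpu,Text2Image_cuda:0' into a model -> device mapping."""
    mapping: Dict[str, str] = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma or stray whitespace
        # Split at the first underscore: left side is the model, right side the device.
        model, sep, device = entry.partition("_")
        if not sep or not model or not device:
            raise ValueError(f"expected Model_device, got {entry!r}")
        mapping[model] = device
    return mapping


if __name__ == "__main__":
    print(parse_load_spec("ImageCaptioning_cpu,Text2Image_cuda:0"))
    # prints {'ImageCaptioning': 'cpu', 'Text2Image': 'cuda:0'}
```

Running a value through a check like this before starting the demo is a quick way to catch a missing underscore or device name.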

Demo

The Visual ChatGPT demo should now be running on your local machine. You can open a web browser and navigate to http://localhost:7868 to interact with the model.
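
If the page does not load, it can help to check whether anything is listening on that port yet, since the models may still be loading. The following is a minimal sketch using only the Python standard library; the helper name is_listening is made up for this example, and the default port is simply the one mentioned above.

```python
import socket


def is_listening(host: str = "localhost", port: int = 7868, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("demo reachable" if is_listening() else "demo not reachable yet")
```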

It’s important to note that while Visual ChatGPT can generate responses based on visual prompts, it is not perfect and may sometimes generate inaccurate or inappropriate responses. As with any AI model, it is important to use it responsibly and carefully evaluate the accuracy of its responses.

Cheers!

Reference: https://github.com/microsoft/visual-chatgpt

Power Apps – Express Design (Build an app in seconds) – Figma to app

figma to app

Continuing from the first post in this series, Power Apps – Express Design (Build an app in seconds) – Image to app, let us see how Figma to app works.

Figma is a vector graphics editor and prototyping tool which is primarily web-based, with additional offline features enabled by desktop applications for macOS and Windows. The Figma mobile app for Android and iOS allows viewing and interacting with Figma prototypes on mobile devices in real time.

Figma to app bridges the gap between design and development: designers and developers can collaborate to build an optimal experience for the end users.

As a designer, you simply create your design in Figma and hand it over to Power Apps, which takes care of converting the design into a working app.

Without further ado, let’s jump into the action.

Design in Figma

1. Go to www.figma.com and create an account. Then, in the left navigation menu, click Community and look for a template named NETFLIX (first prototype).

NETFLIX

2. Click Duplicate to open the template in the Designer tool, which is where you make all your customizations.


3. Copy the URL from the URL Bar (see red circle in step 2 image).


4. Create a Figma personal access token using the following steps:

  • On the Figma home page, click Settings
  • Look for Personal access tokens, add a token description and create a new token
  • Copy the token as soon as it is created; you will need it in the next step

Figma settings
Personal access tokens

5. Go to Power Apps and create a Figma (preview) app, using the URL from step 3 and the token from step 4.

Figma (preview) app

And voila, once more: Power Apps has provisioned the app from the Figma design in a few minutes.

Netflix by Power Apps

This is a huge productivity boost for design work, allowing businesses to roll out user-friendly, great-looking apps to their users with little time and effort.
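
For the curious, here is what the URL from step 3 and the token from step 4 identify. This is a hedged sketch, not how Power Apps itself is implemented: it assumes the file URL has the form https://www.figma.com/file/<key>/<name> (or /design/ in place of /file/), and it only prepares a request for the public Figma REST API without sending it. The function names and the sample URL and token are made up for illustration.

```python
import re
import urllib.request

# Assumption: the file key is the path segment right after /file/ or /design/.
_FILE_KEY = re.compile(r"figma\.com/(?:file|design)/([A-Za-z0-9]+)")


def figma_file_key(url: str) -> str:
    """Extract the file key from a Figma file URL."""
    match = _FILE_KEY.search(url)
    if match is None:
        raise ValueError(f"no Figma file key found in {url!r}")
    return match.group(1)


def build_file_request(url: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the file via the Figma REST API."""
    key = figma_file_key(url)
    return urllib.request.Request(
        f"https://api.figma.com/v1/files/{key}",
        headers={"X-Figma-Token": token},  # personal access token header
    )


if __name__ == "__main__":
    req = build_file_request(
        "https://www.figma.com/file/AbC123xyz/NETFLIX-first-prototype",  # made-up URL
        "example-token",  # placeholder, not a real token
    )
    print(req.full_url)
```

The URL says which design file to read, and the token proves that the caller is allowed to read it, which is why both values are needed.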


API to app

At the time of writing, the App-from-API feature is scheduled for release in early July 2022.

Coming soon…

Power Apps – Express Design (Build an app in seconds) – Image to app

turn images and designs into apps using AI-powered express design

The need for digitization keeps growing, and there are never enough resources (cost, scope, schedule) to meet every requirement. On 25th May, during the Microsoft Build event, Microsoft therefore introduced “Express Design – Build an app in seconds”, a new Power Apps feature that speeds up getting started. It takes existing content (e.g. a picture of your paper form, a screenshot of a design, a PPT, a PDF or a Figma design file) and converts it into a working Power App with UI and data, without requiring the maker to learn how to build an app.

This magic relies on two models: the Azure Cognitive Vision OCR model recognizes the text in your image, and the Azure Computer Vision Object Detection model recognizes the controls in the image, such as a text input, a label or a radio button.

Azure Cognitive Vision OCR & Azure Computer Vision Object Detection model

After that, you can set up the data in Dataverse. This step is optional but recommended, so that your app’s data is stored in Dataverse.

Express Design offers three options, and we are going to look at each of them:

  • Image to app
  • Figma to app
  • API to app

Image to app


Let’s get started by building an app from an image.

1. On Power Apps, click Create and select Image (preview)

Image (preview)

2. The Upload an image screen appears, where you can either upload an image of your own or start with one of the sample images. In our case, let us upload the following Car Details Application wireframe.

Upload an image

Car Details Wireframe

3. After Azure identifies the components in the image, review them and tag and assign each component as required.

Assign components

4. Next, the system lets you create a new table in Dataverse (recommended), or you can skip this for now.


5. In this step, map each column to the required data type, review the result, then create the app.

Columns mapping

And voila, in a few minutes Power Apps has provisioned the app from the given input!

Car details application

This opens up a whole new world of possibilities for citizen developers, such as architects or building technicians, who are looking for a genuine alternative to building an app by hand. It truly is “Empowering every person and every business on the planet to achieve more”.


Now, we are going to see how Figma to app works.