Video: Build Hour: Image Gen | Duration: 52:24 | Summary: Build Hour: Image Gen | Chapters: Introduction to Build Hours (0:05), ImageGen API Launch (1:32), ImageGen Capabilities Overview (3:37), API Feature Demonstration (8:14), Image Generation Capabilities (12:32), Selecting Image Modifiers (20:28), Streaming Image Generation (21:40), Color Customization Demo (22:56), Image Generation Process (24:20), Web Search Integration (26:31), Gamma AI Demo (32:32), AI Image Evolution (33:32), Gamma AI Presentation (36:02), Theme and Generation (38:27), Maskless Image Editing (40:37), Q&A and Conclusion (44:10)
Transcript for "Build Hour: Image Gen":
Welcome back to Build Hours. I'm Christine on the startup marketing team, and today I'm joined by Bill Chen. Hi, everyone. I'm Bill Chen. I'm a solutions architect on the startups team. We have a really fun topic for you today on ImageGen. But first, I want to give a quick refresher on the purpose of Build Hours for anyone new joining us. The goal of Build Hours is to empower you with the best practices, tools, and AI expertise to scale your company using OpenAI APIs and models. So on the right of your screen, we have both a chat function as well as a Q&A function. We'd love for you to submit any questions you have as you're building with us today. We'll also include a link to the code that we'll be using so you can actually follow along. And again, if anything comes up, feel free to put it in the Q&A. We have our team in the room today ready to answer any of your questions, and we'll also leave some to answer live towards the end. I also wanted to drop our home page link on the bottom of this screen, where you can find any upcoming Build Hours. We are constantly taking your feedback and suggestions and adding more, so we actually have three more coming up in June and July with some new topics. Definitely check that out. To set the scene a little bit on ImageGen: we released ImageGen first in ChatGPT back in March. I don't think you could have gone on social media without seeing one of these Studio Ghibli images; they were all over the news feed. In just the first week alone, we had 700 million images and over 30 million users generating images. So this was super exciting, and we knew we wanted to take it a step further and really get this into the hands of developers. So less than a month later, we launched it in the OpenAI API with just one directive: to build cool stuff. And we saw everyone from startups all the way to Fortune 500 companies really taking ImageGen and bringing it to market, incorporating it into their tools and platforms. We actually have one of our startup customers, Gamma, here with us, who will be demoing some of the things that they've built later on in this Build Hour. So here is what we're going to be talking about today. We have new capabilities, even more new capabilities, where we'll go through things like text rendering, world knowledge, and multi-turn editing, and then a really fun demo: we're gonna be live-building a photo booth so you can actually see what ImageGen can generate. And then, as I mentioned, Gamma will come on stage. We have their head of AI engineering, Jordan, with us today, who will share a bit more on Gamma. And then my favorite part is Q&A. As I mentioned, we really try to get to all of your questions, and we do read all of them and incorporate your feedback. So feel free to ask anything and submit your questions on the right. So without further ado, I will hand it off to Bill. Thank you, Christine, for that introduction. And wow, it's really been a while since I've done something like this. The last time I did something like this was all the way back in high school. And fun fact: I believe Christine and I have both done something like this back in high school. So it's really nice to be back, just in front of a larger audience. I'm assuming that most of y'all have probably heard about ImageGen, and better yet, played around with it yourself.
For clarity's sake, it is helpful to define what ImageGen really is and how it is different from our previous generations of text-to-image models, and we'll also do a quick demo walkthrough by yours truly. Afterwards, we'll let Gamma take the stage and show you the cool stuff that they have built. First of all, how is it different? What is ImageGen? ImageGen is different from previous image generation models like DALL·E, which, as you probably have heard, are diffusion-based models. The main difference here is that ImageGen is 4o-native. ImageGen is a GPT-4o-native image generation model, meaning it is the same GPT-4o architecture behind the scenes powering everything, and the generation of the images happens autoregressively. That means it generates the image the same way that GPT-4o generates text: the generation of the image happens almost like next-token prediction. It sounds like it shouldn't work that well, but in practice, what we found is that it works quite well. And it brings you a host of benefits, like being able to render text on top of images well, improved instruction following, granular image editing, as well as editing based on image input. Here is just a quick overview of all the benefits that we have to offer. I won't get too much into it, for fear of it sounding too salesy; feel free to take a pause here and take a quick screenshot. All of this will be available as a recording for you to look at as well. Just a couple of examples of what the improved text rendering could look like: I can have handwritten text or typed text on different surfaces. And since we're on the topic of high school, I still remember back in high school I ran in the student council election four times in a row. Did not get elected all four times. But in the process, I had to create a lot of posters, and I remember having to put ten hours at a time into creating posters like that. So in the rare event that those of you in the audience are looking to do something like that, you can now do it within ten minutes. Hopefully that saves a lot of your time. The added world knowledge is also helpful here. We found that a lot of folks have been making great educational materials, like science posters that explain concepts such as photosynthesis, shown on the slide here as an example, and cellular structures, directly with simple one-line instructions and without additional context. ImageGen was able to just zero-shot all of that, because it is based on GPT-4o, and GPT-4o has all of that world knowledge imbued in it during the training process. And you can also make photorealistic renderings of real-world places because of that world knowledge. Image input is also nice. For example, here you can use multiple images combined with a prompt to generate a final image that incorporates all of those image inputs. As you can see here, it combines all of those images into a cohesive gift basket. And just to take you out of the presentation really quickly: we do have an image gallery on our website. Here you can see some of the images that I've shown you, and the prompts as well as the inputs that have gone into generating those images. Some of those have image inputs as well, combined with text inputs. It's worth playing around with it yourself. Oops.
Going back to our presentation: I've talked enough about the capabilities that are all available in ChatGPT. How can you build anything with it? As you might have suspected, these are available in the API as gpt-image-1, which brings the experience to the API. So I will talk briefly about its capabilities and explain how you might best use it. Just to get things out of the way, we actually released a couple of new features last week. Last week, we released those new exciting features as a part of the Responses API improvements. ImageGen is now available as a built-in tool inside of the Responses API. Those improvements include streaming, multi-turn editing, multi-tool image generation, as well as masking. I will get into each of these here. Streaming: quite self-explanatory. ImageGen does take a decent amount of time to completely finish generating an image, anywhere between thirty seconds to a minute depending on your settings. And to enable you to build responsive user experiences, we've added a streaming feature in the Responses API that allows you to stream partial renderings of the image as they become available, before the full image gets completed. Multi-turn editing is also, incidentally, quite self-explanatory. You can pass an image back in, either through its ID or by uploading it wholesale, in order to combine it with an additional text prompt, resulting in a different image. How it works is that the Responses API provides you with an image ID, or a previous response ID, with every response, and you can pass that back into the next response, and so on and so forth, resulting in a multi-turn editing user experience. Multi-tool image generation in Responses: now you can use other built-in tools together with ImageGen as well. Rather than explaining it with this slide, I think it's best to just show you a little bit of how you can play with it yourself. So let me open up the OpenAI Playground. For those of you familiar with it, this is where you can play around with the latest models that we have, prompt those models, and see what they produce. Make sure you're on the Responses API, because the Responses API is the one that offers built-in tools. You can select the ImageGen tool and add it, and then we can also select the web search tool, just for demonstration purposes. As I mentioned before, ImageGen has a really capable model of the world. It has internal knowledge of the world and how the world works, but it still doesn't have access to real-time information. For that, we can use the web search tool to look things up on the Internet. So what that would look like is we can do something like: look up the weather in New York City right now and generate a poster image with the information. And here, I can send in this prompt. As we can see, by giving it access to the web search tool as well as the image generation tool, the Responses API was able to decide intelligently itself, using the GPT-4.1 model, to call the web search tool. And we have looked up the proper weather. The weather today, the twenty-ninth, as we can see here: a low of 61 degrees Fahrenheit, a high of 70 degrees Fahrenheit, cloudy in the morning, then intervals of clouds and sunshine with showers in places this afternoon. And as you can see here, we have the ImageGen tool starting to generate and stream in this response, with up-to-date information directly.
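For those following along in code rather than the Playground, here is a minimal sketch of that same setup with the Node SDK. This is an illustrative reconstruction, not the demo's actual code; the model name and tool type strings are the ones documented around the time of this session and may change.

```ts
import OpenAI from "openai";

const openai = new OpenAI();

// Hand the model both hosted tools; it decides the call order itself
// (search the web first, then generate the poster).
const response = await openai.responses.create({
  model: "gpt-4.1",
  input:
    "Look up the weather in New York City right now and " +
    "generate a poster image with the information.",
  tools: [{ type: "web_search_preview" }, { type: "image_generation" }],
});

// The generated image comes back base64-encoded on the
// image_generation_call output item.
for (const item of response.output) {
  if (item.type === "image_generation_call" && item.result) {
    console.log("got image, bytes:", Buffer.from(item.result, "base64").length);
  }
}
```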
And all of this is available directly out of the API itself; you do not have to define your own custom functions to implement it. It's all available out of the box. Masking: also quite self-explanatory. You can create masks and build in-painting experiences. For example, here we created a mask and indicated that only certain areas are available for editing. And as you can see, only that area was edited and nothing else was changed. We have a flamingo right here. Image generation as a whole used to be just a text-to-image experience. With these advancements and modern capabilities, we can now truly say that design can be thought of as a dialogue. That's enough talking about the capabilities. Now I would like to quickly go over some of the ideas that have come up on what you can build with it. Just to dump a couple of use case ideas for you: you can use it for marketing and brand design, generating posters and marketing material for products on the fly. It's never been this easy. E-commerce and retail as well. This image was generated; as you can see here, I literally pulled it out of the gallery that we have. And this image, if you recall, was generated by combining an image of a model with a product image of the dress that she's wearing right now. Now, can you imagine an experience where, let's say, you're an e-commerce store that sells dresses: how cool would it be to let your customers try on the style before they buy, through their own photos? And you can certainly make a lot of cool educational posters out of it as well. As a child, I used to read a lot of books, and yes, I would have loved to have this. This is a bit of a meta slide, since I've used ImageGen to generate this image to put in this presentation, to tell you that you can generate images to use for presentations. It's quite self-explanatory: great for presentations. This is a bit of a personal idea for me as well. I love games, and I've tried to build games on my own. Back in high school, I used something called RPG Maker, I believe, and I think the biggest pain was finding the right character assets, the sprites, to put into the game. If you're like me, you know that pain. Now is the perfect time to go back to those ideas and finish building that game you wanted to build. This is definitely not an exhaustive list. These are just what I could think of when I was putting together this deck in under an hour, in a bit of a sleep-deprived state. But you guys are probably smarter than I am, so I'll let you take it away and build cool things with it. Definitely let me know what you're trying to build. Now that we're talking about building, it is useful to get a little bit into the best practices as well. I also had a lot of fun making this slide, because you can obviously see that I've generated a lot of those images myself. So, choosing the right API format: we offer ImageGen in two API formats, Responses as well as Images. Images you might be familiar with if you have played with it before, using it to generate images with DALL·E. We recommend using the Images API for single-turn, straightforward text-to-image tasks, but only that.
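Circling back to the masking capability mentioned above: here is a minimal sketch of that in-painting flow. The demo itself used the Responses API, but the long-standing Images API edit endpoint accepts a mask too, and that's what's shown here; the file names are placeholders.

```ts
import fs from "fs";
import OpenAI, { toFile } from "openai";

const openai = new OpenAI();

// pool.png is the original photo; mask.png is the same size, with the
// editable region marked via the alpha channel (check the docs for which
// way around the transparency goes; Bill hedges on this in the Q&A too).
const edited = await openai.images.edit({
  model: "gpt-image-1",
  image: await toFile(fs.createReadStream("pool.png"), "pool.png", {
    type: "image/png",
  }),
  mask: await toFile(fs.createReadStream("mask.png"), "mask.png", {
    type: "image/png",
  }),
  prompt: "Add a flamingo to the editable area; change nothing else.",
});

const b64 = edited.data?.[0]?.b64_json;
if (b64) fs.writeFileSync("pool-flamingo.png", Buffer.from(b64, "base64"));
```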
And for anything else, actually for most use cases, I will recommend using Responses, because it has the built-in multi-turn, multi-tool experience, and for experiences that might require additional reasoning on top, you can also call a base model that orchestrates all of those tools together. Next, you can also customize the image output. Size and quality affect the number of tokens that get used, and the model is built on tokens. You can fiddle around with the output parameters to get the format that you want. A couple of things to note here: you can only use a transparent background with certain formats, like PNG as well as WebP. All of this is available in our documentation on ImageGen. And this last part seems straightforward, but folks often forget it: the user experience. ImageGen does take a little bit longer to generate. What should the user experience be while the image is generating? Should it be streamed? These are all questions you should think about answering before building. There are also certain limitations that ImageGen has; I will call those out in the next few slides. Knowing what those limitations are and putting guardrails into place is also something that folks often miss. So, limitations. There are a couple of them, and it's worth calling them out here. For one, the generation speed is quite a bit slower than before, but with streaming, you might be able to improve the user experience there. And text rendering is good, but definitely not perfect yet. The other day, I was trying to generate a poster with some Chinese characters on top, and as someone who speaks Chinese, I wasn't able to understand a couple of those characters. So for text rendering in languages other than English, you might run into some of that as well. Consistency for multi-turn images is good, but also not perfect either. One last note on moderation as well. As with all the models that we make public, we put a significant amount of consideration into safety and moderation. All generated images will be done in accordance with our content policy, which is publicly available here. That means no violence, abuse, or anything dangerous. There's a moderation parameter that you can pass in, but low is as low as it'll get, and you can tune the sensitivity yourself as well. So even then, it might still refuse certain generations, and it thereby may not be a good fit for your use case if you are looking to generate certain types of content, despite good intentions; for example, in an artistic context. Now that we're done with all the concepts, we're getting to the fun part: let's build something together. Well, I've actually already built a lot of it, so I'll talk you through it, but we'll also be adding a couple of new features as well. So let's get right into it. Oh, here's the demo. This is the slide where we get into the demo. Great. So let's see here. Let me just show you what the front end looks like. Alright? Here is an app that we built for our Exec Summit. The Exec Summit is basically this event that happened three weeks ago, where a bunch of Fortune 500 CEOs came to San Francisco to see what we had built here at OpenAI and discuss a lot of the things that are happening. And here we built a photo booth app, and I repurposed it for this demo.
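Before the demo, here is a quick sketch of the output knobs just mentioned: size, quality, background, output format, and the moderation parameter, via the Images API. The parameter values shown are the documented options; the prompt is just an example.

```ts
import OpenAI from "openai";

const openai = new OpenAI();

// Size and quality drive the image-token count (and therefore latency and
// cost); transparent backgrounds need an alpha-capable format (png/webp).
const result = await openai.images.generate({
  model: "gpt-image-1",
  prompt: "A die-cut sticker of a flamingo wearing sunglasses",
  size: "1024x1024",         // or "1536x1024", "1024x1536", "auto"
  quality: "high",           // "low" | "medium" | "high" | "auto"
  background: "transparent", // only valid with png or webp output
  output_format: "png",
  moderation: "low",         // "low" is as permissive as it gets
});

const b64 = result.data?.[0]?.b64_json;
```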
For this demo, I've also added a lot of the cool new features that I talked about during the presentation. This is basically a very simple Next.js photo booth app. What I can do here is upload a picture of myself that I prepared ahead of time. That's me, it's yours truly. Look at the way he smiles and looks at the camera. Yeah. Sorry, it's a little embarrassing. And so I have a set of modifiers available for me to choose from here. This is where I turn to Christine, my partner in crime, and see which ones to pick. Yeah, okay, so we have to do the Ghibli style, of course, kind of as a take on the launch. Knitted cozy scene, definitely. Japanese anime movie poster, okay. And the last one, maybe the minifigure? Awesome. Those sound amazing. So why don't we kick off these options. I've said that image generation does take a little while to complete, and this is also where we clasp our hands together and pray to the gods that the live demo works. Again, we don't can any of our demos. Oh, as you can see here, things actually look to be streaming in. This is where it's helpful for me to open the Chrome developer tab to see what is happening behind the scenes; I'll also get into the code in just a bit. As you can see here, this is the first new feature that I've built into this demo. This wasn't available at the Exec Summit, so this already is something new since last week. As you can see, all of the images are streaming in. At the Exec Summit, the way it worked is you would have to wait until the last image got generated, so folks were left twiddling their thumbs until then. As you can see here in the Chrome developer tab, the way streaming works is we're passing back partial images. We have two types of images that we're passing back to the front end: partial and final. Partial images are, well, self-explanatory: they're partial images. With the ID, we're able to update the image in each one of those panels. Yeah. Now that those images have been generated, let's look at the second thing that I've added to our little demo. Let's take this Ghibli image here, and let's say we wanted to make certain changes to it. Two weeks ago, in this demo, you weren't able to do that. But now, because of the Responses API, we're able to add additional prompts to change it. I'm kinda curious, what's everyone's favorite color here? I want to give it to the chat. Yeah, so we actually had an overwhelming number of people vote for green. Just another one came in. So I think we have to change this background to green. Okay, well, the background already seems pretty green, so let's make it more green then. Make the background more green, and actually, just make it a darker shade of green. And let's see, what else can I do with this? Oh, we can say something like: keep everything else the same. And let's click on modify. As you can see here, we're passing the image back for it to generate a new image. While it's modifying, let's not twiddle our thumbs and wait for it. Let's dive right into the code, into what we have built. For the initial generation, I'll show you the code logic that it goes through.
When we click on one of those buttons, the front end passes back the modifiers that we have selected, which then go through a mapping here that maps them to prompts. This is a very basic prompt, and I typed it out in, I think, ten minutes and basically vibe-coded it, so I'm sure there's lots of room for improvement. This is, again, where I mention that this repository is already public, and you can access it in our Build Hours repo. I have made a few changes, and I will push the updates right after the session. So you can look at those prompts yourself, pull the code, host it yourself, and prompt-engineer to your heart's content. Once the mapping has completed, we generate the complete system prompt, which says: maintain the original composition of the subject, generate an image of the subject with the following modifiers, then the mapped prompt, then maintain the original composition of the subject again. And afterward, this is where the magic happens. As you can see here, we're using the Responses API. We're creating a response, and we are passing in our input image, the base64 encoding of our input image, along with the prompt. And here we're giving it a tool, the image generation tool, and we have a couple of parameters that we pass in as well, for example the size and the quality. I definitely want high quality. Partial images: three. And the other thing that we set here is stream: true. What this allows you to do is open a stream and start to parse the different events that get sent back. So we can take each partial image and send it back to the front end, and we can also take the final image and pass it back together with the response ID that's associated with the image. The other thing worth calling out here is the tool choice: I've set it to required because we are using the Responses API. You can certainly prompt it to make it call tools, but by setting tool_choice to required, it is forced to make one or more tool calls. And since we've only given it one tool, that means the Responses API is forced to call the image generation tool. So that's just one caveat to keep in mind when you use the Responses API. Oh, let's get back to this image that has finished generating. Wow, that sure is quite a dark green color. Almost looks like nighttime, if anything. But it's not a very good showcase of multi-turn; we've only shown one turn of it generating, right? So why don't we add something else. So, I can't exactly tell where this handsome gentleman works, Christine. Where does he work? Hmm, I think he works at OpenAI. Okay, yeah, of course. And how can we fix it? Maybe we should add an OpenAI logo to his shirt, like how it's showing on my hoodie right now. So let's do something like: add an OpenAI logo to his shirt. And we can click modify once again, and hopefully this works. In the time it's generating, we can shift our attention to how the edit logic works. It is in another route.js file. Again, you guys are probably more well versed in Next.js than I am, so you can probably just download the repo and look at it yourself. For the other folks: this is basically where all the logic resides for this particular endpoint, the edit endpoint.
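To recap the generation endpoint he just walked through, here is a condensed sketch of that route's core logic. This is a reconstruction from the description, not the repo's exact code; systemPrompt and photoB64 stand in for the mapped prompt and the uploaded photo.

```ts
import OpenAI from "openai";

const openai = new OpenAI();

const systemPrompt = "Maintain the original composition of the subject. ..."; // stand-in
const photoB64 = "..."; // base64 of the uploaded photo (stand-in)

// tool_choice: "required" forces at least one tool call, and since
// image_generation is the only tool provided, an image is guaranteed.
const stream = await openai.responses.create({
  model: "gpt-4.1",
  input: [
    {
      role: "user",
      content: [
        { type: "input_text", text: systemPrompt },
        {
          type: "input_image",
          image_url: `data:image/png;base64,${photoB64}`,
          detail: "auto",
        },
      ],
    },
  ],
  tools: [{ type: "image_generation", quality: "high", partial_images: 3 }],
  tool_choice: "required",
  stream: true,
});

for await (const event of stream) {
  if (event.type === "response.image_generation_call.partial_image") {
    // Forward { type: "partial", index: event.partial_image_index,
    //           b64: event.partial_image_b64 } to the front end.
  } else if (event.type === "response.completed") {
    // Forward the final image plus event.response.id so later
    // edits can chain off this response.
  }
}
```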
Here we also see very similar code logic. We have the input text, which is the prompt that we passed in, and we have the image URL, which is the base64 encoding of the image. One thing that I would love to call out here, and for you guys to take notice of, is that, if you recall from the presentation I gave, instead of passing the base64 encoding of the image wholesale back to the back end, you can just pass the previous response ID back. There's no single right way of doing this, but that way, we can manage all of the state for you, and there are fewer whole images that you have to send back and forth. Just to give you a quick look at how you can do this, you can go to our image generation docs here on our website and take a look at multi-turn image generation. Here, we provide you with an example of how this is done. We have a first response, and we have a follow-up response. Instead of passing in the entire image in base64 encoding, we're just passing in the response ID, and it basically works like that. So there are certainly things you can do to simplify. And here we see that it is very clear where this lovely gentleman works. One other thing that I would like to add here: what if I want access to live, real-world information that's only accessible on the Internet? I've shown you how we can do it in the Playground, but we're all builders, so that means we have to code. Where should I put this exactly? I'll take a moment here and see if folks have figured it out. I see that a few folks have indeed figured this out. But just to give you guys a hint: it's not here, and it's not here. And, also, I put those comments there because I kept messing it up by putting it in the wrong place. As you can see here, we have a tools field, and it is an array. This is where you can define your custom tools as well as hosted tools. And what we can do here is add... oh, looks like Cursor just read my mind. We can add a one-line description here. With this, we've basically added the capability to use the web search tool. And let's look up some knowledge that's only available recently. I just recently got into the NBA. Well, I didn't get into the NBA, but I got into following basketball; I've lived in Europe for five years. And since the Golden State Warriors lost, I've been following along with the Knicks. We don't talk about Knicks versus Pacers, but I'm wondering about Knicks versus Celtics and how they did. So we can do something like: look up the latest score of Knicks versus Celtics and add that score to the background of the image. Keep everything else the same. And I can click modify here. Since this is the last part of our demo, we are once again going to clasp our hands together and pray to the demo gods that this will run properly. Let's give it a second and see what it generates. Oh, and in the meantime, I can talk you through a little bit of what exactly is happening. This is basically the same as what you saw in the Playground: if we say something like "look up the latest score", it's going to decide that it needs to call the web search tool before it can generate an image. And that's exactly what it did.
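Put together, the edit endpoint pattern described here, chaining off the previous response instead of re-uploading the image, plus adding web search to the tools array, looks roughly like this. It's a sketch along the lines of the multi-turn example in the docs; the prompts and model name are illustrative.

```ts
import OpenAI from "openai";

const openai = new OpenAI();

// First turn: the initial photo-booth generation (as in the sketch above).
const first = await openai.responses.create({
  model: "gpt-4.1",
  input: "Generate a Ghibli-style portrait of the uploaded subject.",
  tools: [{ type: "image_generation" }],
});

// Follow-up turn: no re-upload needed. previous_response_id lets the API
// carry the image state, and web_search_preview lets the model fetch live
// information before editing.
const second = await openai.responses.create({
  model: "gpt-4.1",
  previous_response_id: first.id,
  input:
    "Look up the latest score of Knicks versus Celtics and add that score " +
    "to the background of the image. Keep everything else the same.",
  tools: [{ type: "image_generation" }, { type: "web_search_preview" }],
});
```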
And, indeed, the Knicks beat the Celtics and went forward to the Eastern Conference finals. And with that, that concludes the demo I have prepared. Again, I have this entire repo available as a part of our Build Hours repository, so feel free to play around with it, hack it yourself, maybe even take a product to production with it. Really excited to see what you build with it. And now I'll pass it back to Christine. Yeah, of course. I am really excited for this next part. We are welcoming Jordan, the head of AI engineering from Gamma, onto the stage with us. So, Jordan, hi. How's it going? Thanks for joining us today. It's going great. Thanks for the intro, Christine. Of course. So I'll let you take it away. I know you have some really cool features that you wanna show. So, yeah, feel free to share your screen. Great. For my talk today, I'll actually be showing a lot of very similar technologies to what Bill just showed, in Gamma, and how we use them to power our app. Right. So at Gamma, our mission is to help bring your ideas to life. We do that through helping you make presentations, documents, websites, and, recently added, social media posts as well, and we do all of this with AI. Across all these mediums, Gamma is a visual platform; it's a visual medium. So the three pillars for us have always been charts and diagrams, visualizations and layouts, and lastly, AI-generated images. And over the last two years, we've generated a lot of AI images. We do about 700,000 AI-generated presentations per day, and every presentation has several images in it. So we just recently crossed 1 billion AI-generated images through our platform, through various providers, one of them being the new ImageGen model. But historically, AI-generated images have had a lot of problems. We were definitely an early adopter of using AI-generated images in our presentations, and we kept saying, one day it'll get better. Through that process, we've had issues that I think everyone here has seen in some form or another: hands being weird, limbs being weird. This third image, of an OpenAI Build Hour, was something I generated just yesterday, and even today, with some of the newer models that are faster and cheaper, it still can't get text right. But in the last month or two, I would say, there's been a lot of good news, with a lot of new models coming out. AI images have gotten a lot better, so much so that we're able to use them in new contexts in our presentations that we weren't able to before. This is an example of what an AI image from November 2023 looked like when we asked for, I think, delicious sushi. It's kinda weird; I don't think the egg goes in the sushi like that. And to contrast, this is what yesterday looks like. Much better. The quality is just outstanding, and it's now something that we can, with more confidence, include in the generated presentations that we make through Gamma. So as a part of this, I'd like to show how we use AI-generated images in Gamma and give a brief overview of the general platform. I'm gonna create a presentation about today's topic, the OpenAI Build Hour. With Gamma, you can easily create a presentation from just a single line. So we'll go to the generate option, and I'll type in "OpenAI build hour image gen", and then I'll give it the date to tell it it's this build hour.
One of the other interesting things, not related to ImageGen, is actually web search. So having web search as a native tool... oh, looks like this may have stalled out. I'll try that again. Having web search as a native tool built into language models really helps with topics like these, things that are past the training cutoff of language models, like current events, which before would just not be able to show up here. It would basically be a hallucinated outline of a presentation where all the details would be made up. But as you can see here, when we search for today's build hour, we actually get real information based on the actual web page where all of this came from. So I'm gonna just make a couple of tweaks to make this a little better. I'd like maybe a separate page for speakers, and then I will add some titles. I can make this a little bigger. For Bill, I'll say OpenAI solutions architect. Then I'll say for myself, Gamma head of AI engineering. And because I don't want Gamma to AI-generate these people, I actually have images already that I can include. So I'll put in one of Bill, and I'll put in one of me. This is basically telling our AI: use these images instead of generating new ones. So, hopefully, it does that. The next part is, once I have an outline that I'm okay with, I will choose a theme. For this demo, I've actually made an OpenAI theme, so I'll search for that and select it, and then I'll make sure to use the GPT Image model for this. With this theme, it's attempting to match the styling of OpenAI, but it's also attempting to match the imagery, kind of the abstract gradients that I think the OpenAI brand uses. So I'll go ahead and generate this deck. And like Bill mentioned earlier, image generation is definitely a slow AI operation. When we generate decks, the language model is actually pretty quick; it's able to generate the content, and we're usually waiting on images to be generated. As this deck is streamed in, we're taking the outline that we had before, the one that used the web search tool to find information about this build hour, and passing it to another language model to do the full deck generation, along with choosing the layouts and the visual representation that we want. So, as we can see, it looks like it used the images up here. I'm gonna go ahead and change the layout of this, and get rid of this "our audience" thing; I think we just want this to be about speakers. And then one thing I can do here is switch the layout. I think this one's probably a little better. And we can go ahead and look at some of the images it generated. So this one, even if it's not the best with text, is quite a good image, I would say. It got the text right. This is an interesting one. A lot of the images, I would say, lean more towards showing, like, full websites. The last part I'd like to show, and another feature that we've been building on top of ImageGen, is the ability to do maskless editing. If we have an image we don't like, we can open this feature to open our chat with AI, and this is actually doing a chat with the context of this image. So one thing I can try is: regenerate this image to be an abstract gradient, and we'll see what this does. From this menu, we can do one of two things with images: we can either create them or edit them.
In this case, we're creating a new image, because I don't think this is necessarily an image that I want to tweak; it kinda missed the mark. But in other cases, we can show where image editing might be the right solution. I believe this is using the GPT Image model at medium quality internally, and for us, we've seen latencies of about thirty seconds. I am definitely looking forward to implementing some of the streaming functionality, as I think that would make this a much better experience. So I'm gonna pick this one, and that's gonna go ahead and update in my presentation. And then let's try this: let's edit this to just remove the text below the laptop and see how it does. So this is using maskless editing. Before, when you wanted to do image edits, you needed to also supply a mask. This just takes your text, is able to interpret it, figures out where to edit, and edits it. And that was pretty good. Let's also remove the top bar and see if that works. So it's gonna remove that, and, hopefully, we'll get an image with just the laptop. There we go. Obviously, if this was a real deck, I'd probably make more tweaks, but I think this is enough to show the development cycle of building with Gamma, where you can supply a general style and a general outline for the deck you want, but then use AI image editing to actually refine these images. The last thing I wanna talk about is maskless editing itself. One of the big wins we've had really recently, which I feel like probably makes sense to call out at an OpenAI Build Hour, is that we actually switched from other models that do maskless editing to GPT Image. And we saw, basically overnight, a 27% improvement from users using this. That's based on users rating images after they've been edited. It's these types of wins where it's basically a one-line code change, and we get huge improvements. So we've been really happy about that. And if you'd like to try Gamma, you can do it for free at gamma.app. Alright, I'll hand it back to Christine and Bill. Thanks so much, Jordan. Actually, we had a question come in from the... I'm sorry, Christine, you cut out on my end. Could you repeat that? Yeah, sure. We had a question come in from the chat. The question was: if I want to make a strategic sales deck, would you recommend I get the strategy from ChatGPT first, then give that to Gamma? Yeah. So we actually support many different ways to import your content into Gamma. I would say, if you're in a more professional use case, a lot of people use our paste-in mode. That allows you to paste in a full outline, or full pages of research, and have Gamma either condense it or preserve it into the pitch deck you want. We see a lot of people using language models like ChatGPT to first synthesize their thoughts, do research, and generate images, and then just bring that into Gamma and have Gamma handle splitting it up into slides and doing the visualizations. Got it. Thank you again for joining us. We will move into live Q&A now. Let's see the first question. Okay: I'd love tips around consistency and granular control around things like object references and style. Yeah, I think that is a very common question, Christine, and I'm glad you guys have asked it. A couple of things here. Definitely for consistency and granular control, there are a couple of knobs that you can turn with the new ImageGen API, especially because now it is 4o-native.
Prompting is actually very, very important. What that means is you need to follow the best practices for prompting: be as specific as you can, and do not give conflicting instructions to ImageGen. And then for the other things, like granular control over object references and styles: references are the one thing I'll pull out from this question, because we let you pass images in as part of the input, and reference images are actually hugely helpful to inform what it should generate. So if you have reference images of objects that you want to put in a certain scene, or styles that you want the image to be generated with, you can pass all of them in as image inputs. So prompting well, as well as providing references, those are, I think, the two biggest levers for this particular question. Shall we move on to the next one? Yeah: how would you resolve mutation problems, especially ones coming from the prompts? Well, speaking of prompts: mutation sounds like a scary word, but I assume you mean things like unexpected generation results based on mistakes in prompts. I will say it's actually easier than most folks imagine to spot those issues and fix them. If there are contradictions inside of your prompt, definitely fix those. One tip I will offer you is that before I generate images, I actually put all of my prompts through GPT-4.1 or o3, just to make sure, because I think nowadays the models are smarter than me, at least in anything writing-related, and prompts are writing. So you don't have to do it alone; do it together with the other models, I guess. Great. And the next question: what would you suggest as best practice on quality and cost if you want to generate a story, with text interspersed with images, that is all consistent? Another great question, and now we're getting to use cases; you're getting me really excited here. Generating a story with text and interspersed images: I'm not exactly familiar with what your specific use cases are, but I have a few concrete ones of my own in mind. For example, if you want to generate children's books on the fly, or educational materials on the fly, then definitely ImageGen would be a great fit for that. A couple of things that you can do here; I'm just sort of thinking out loud, since I didn't really prepare for this question to come up. First of all, everything that I've said before: reference images, and passing the images back. Because of the Responses API, you can generate images in a multi-turn manner. So what you can do is either pass images back in as references, or pass in the previously generated images as image generation call IDs. The Responses API will be able to take all of those, see what you have previously generated, put together the bits and pieces, and generate images that are consistent in style. As for quality and cost, that is also a great question. The recommendation I will provide here is: use the best, highest quality that you can, latency and cost permitting, because that way, you will be able to see the ceiling of what the ImageGen model is capable of.
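A sketch of the consistency recipe from this answer, reference images on the input plus chaining turns so the model can see what it generated before, might look like the following. It's an illustrative reconstruction; the prompts and placeholder variables are not from the session.

```ts
import OpenAI from "openai";

const openai = new OpenAI();

const characterRefB64 = "..."; // base64 reference image of the character (stand-in)

// Turn 1: establish the character from a reference image.
const page1 = await openai.responses.create({
  model: "gpt-4.1",
  input: [
    {
      role: "user",
      content: [
        {
          type: "input_text",
          text: "Illustrate page 1: this fox character waking up at dawn, storybook watercolor style.",
        },
        {
          type: "input_image",
          image_url: `data:image/png;base64,${characterRefB64}`,
          detail: "auto",
        },
      ],
    },
  ],
  tools: [{ type: "image_generation" }],
});

// Turn 2: chain off the previous response so the next page stays
// consistent with what was already generated.
const page2 = await openai.responses.create({
  model: "gpt-4.1",
  previous_response_id: page1.id,
  input: "Illustrate page 2: the same fox crossing a river, same style and palette.",
  tools: [{ type: "image_generation" }],
});
```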
Continuing on quality and cost: after that, once you are sure that whatever use case you had in mind is a right fit for ImageGen, despite the limitations that I mentioned during the presentation, you can start to tune things like output formats as well as the quality, going from high to medium or low, and see which ones would be the best fit for you. Next one: is it possible to not edit part of the image, leaving it intact? I want to isolate the area of the image being edited. Great question. If you remember, I mentioned masking very briefly, because it's a little bit hard to put together a demo around it. But you can do this very easily by passing in a mask image together with the image that you want to edit. The mask image basically has an alpha channel; I can't exactly recall which way around it is, and it's all available in our docs. I believe the transparent areas are the areas that you would like to be modified, and the parts that you don't want modified, you leave fully opaque in the alpha channel. It could be the other way around, so play around with it yourself. That would be my recommendation. Awesome. That's all we have time for today, but we really appreciate all your questions. We are reading them and taking your feedback, like I mentioned. So I'll leave you with some parting gifts: there are some links that we will be sharing out in an email afterwards, along with the recording. These are really helpful for you to try out. And as Bill mentioned, he's gonna be updating the code that we used today; I saw some requests come in from you guys on different tweaks and wanting to play with it yourselves. So, with that, we will see you June 17 for our next Build Hour on voice agents. Thank you, everyone, for joining us, and we'll see you then.