Video: Coral brings AI to the Edge

https://www.youtube.com/watch?v=nv9if4YY8v8

Our next talk: we have Naveen here on the list. [inaudible] Well, we're having a little bit of an audio problem there with your audio.

Can you hear me? All right.

We're getting a little bit of feedback, I think, or something like that. Maybe it's just me. Let me finish up the introduction and then we'll see how we go from there, OK?

So we have Naveen today from the Google Coral team. He's going to speak to us about the Coral framework and devices. Coral is a complete toolkit for building products with local AI capability that is efficient, fast and offline if required.

Naveen has a master's degree in intelligent systems and works on the Coral team as an application engineer, helping customers migrate their applications to the Coral platform. And when he's not helping people bring AI solutions to the edge, he lives on the edge and does horseback riding and motorcycles. And with that, Naveen, I'll turn it over to you.

Thank you. Thanks for the introduction. [inaudible]

You're still getting a little feedback. I don't know what that is exactly, or what needs to happen.

I think. I think.

While Naveen's doing that, if you guys have any questions, please put them in the, you know, the thread there and we'll go ahead and answer them.

Hopefully, he can get his distortion fixed up and reconnect. I'm not quite sure what's up with that. Maybe the Internet gods are just not working well with us today. But we'll get that sorted out here in just a moment. So have any of you looked at the Coral products? I've spent some time with them.

In fact, I have one. Let me just show you back here, just to fill in: this little board here that I have is exactly the same size as a Raspberry Pi.

The case that it sits in is actually a Raspberry Pi case, if you're familiar with one of those. It is capable of doing some pretty fantastic things. You've probably heard that Google has these TPUs, and they've been able to shrink one down. I don't know how they did the magic here, but you can kind of see the processor in this device and it's smaller than your little pinky fingernail. And they're able to get TPU-like performance. So it's amazing that they can do that. I don't know how they do it, like I said. But I've seen this particular device, when I was doing object recognition, run up to like 65 frames per second, detecting objects in a video stream.

Oh, looks like Naveen is back. And so can you hear me better now?

Oh, much, much better. I'm going to drop off and let you give your presentation.

Should I take it from here?

Yeah, you're up, so go ahead and start whenever you like. Oh, sure.

Hello, everyone. Thank you for joining us today at DevFest West Coast. I'm from the Coral team. I'm an application engineer helping customers integrate their solutions onto the Coral Edge TPU. Today I will introduce you to Coral devices, explain the ecosystem around them, and cover the hardware you need to use the Edge TPU and run AI/ML inference. So let's start with why Coral.

So the main reason why the Coral platform was built is to democratize machine learning and also provide a lot of privacy for users at the edge. With increasing privacy concerns, running AI at the edge is the most important thing.

So the Coral platform provides privacy in machine learning applications, and the devices save a lot of power and are efficient when you deploy them in embedded applications. And the most interesting thing is that all the applications are offline and you can run them without Internet. Especially in, say, [inaudible], you should not be stopped by the Internet for your applications.

So what does the Coral platform consist of? It consists of some hardware components and software tools as well. Taking a look at the hardware components, we have Edge TPU modules available in different form factors, with PCIe and USB connections. And we also provide some ambient sensors to sense audio, video and environmental variables like temperature and moisture, and also motion sensors. Coming to software and ML tools, we have an OS called Mendel Linux, which is used on the Dev Board [inaudible]. And we have a compiler that takes a TF Lite model and maps the operations to the Edge TPU, and a runtime which takes care of switching the workload back and forth between the CPU and the Edge TPU. Of course, we do have documentation, providing data sheets and schematics to develop your own applications, and also a huge list of example projects, which can be a great resource for you guys to start prototyping your products.

So these are the products available right now, distributed around the world in 36 countries. The Dev Board and USB Accelerator are our two prototyping devices, which aren't designed for production, though we have seen a couple of customers going into production with these devices. But these are designed for prototyping.

And we do have the System-on-Module, which is technically a Dev Board without any peripherals around it, so we provide you schematics and all the information needed to fit this into your own board and build applications. And we do have a PCIe accelerator form factor which can be plugged into your local host machine or your industrial PC, enabling it to perform AI at the edge. The PCIe accelerators are industrial-grade accelerators, helping users build out AI at the edge. We also have the camera module, and the environmental module, which gives you information about [inaudible] light, which can be used for IoT applications.

Yep, and the.

This year at CES in Vegas, we announced two new products. One is the Dev Board Mini, a similar version of the Dev Board that you see on our website, but in a very low-power, small form factor that still uses the same Edge TPU. [inaudible] And we are also looking at developing two-gigabyte and four-gigabyte Dev Boards. All of these have the Edge TPU built in, which consumes about two watts while delivering the performance of four TOPS.

So here is an upcoming product; it is called the Edge TPU accelerator module.

This is a smaller version of the Edge TPU which can be soldered on, with a USB 2.0 interface or a PCIe interface.

This is another production module of the Edge TPU, and it comes with [inaudible] and everything integrated. It was developed in collaboration with Murata.

So now let's look at our software and software libraries. Officially, we support Mendel Linux. Mendel Linux is on the Dev Board and Dev Board Mini, and it is based on Debian Linux. This platform is used by most of our prototyping solutions and isn't recommended for production. For production, we support Yocto and Android, which can be built for the same [inaudible] and used in production.

So running inference requires the TF Lite runtime on the embedded device. It can be USB or PCIe or any form factor; every form factor requires the same runtime, TF Lite. We currently support ARM-based architectures and AMD64 architectures.

So most of our [inaudible] are built in [inaudible] so far.

Very recently we started supporting [inaudible] for USB and PCIe form factors.

Sorry.

So to run inference, we use the TF Lite interpreter, which can be installed using pip and is supported by the TF Lite Python API. So all you need to do is import your TF Lite interpreter and load the delegate to run it.
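[Editor's note] The "import the interpreter and load the delegate" step described above can be sketched as follows. This is a minimal sketch, not the speaker's demo code: it assumes the `tflite_runtime` package and the Edge TPU runtime are installed, and the helper names `delegate_library_name` and `make_interpreter`, as well as the model file name in the usage note, are hypothetical.

```python
import platform

# Shared-library names of the Edge TPU runtime, per operating system.
EDGETPU_LIBS = {
    "Linux": "libedgetpu.so.1",
    "Darwin": "libedgetpu.1.dylib",
    "Windows": "edgetpu.dll",
}


def delegate_library_name(system=None):
    """Return the Edge TPU delegate library name for an OS name."""
    return EDGETPU_LIBS[system or platform.system()]


def make_interpreter(model_path):
    """Build a TF Lite interpreter that hands supported ops to the Edge TPU."""
    # Imported lazily so the helper above works without the runtime installed.
    import tflite_runtime.interpreter as tflite

    delegate = tflite.load_delegate(delegate_library_name())
    interpreter = tflite.Interpreter(
        model_path=model_path,
        experimental_delegates=[delegate],
    )
    interpreter.allocate_tensors()
    return interpreter
```

On a machine with an Edge TPU attached, something like `make_interpreter("model_edgetpu.tflite")` (a hypothetical file name) would then be followed by the usual `set_tensor`, `invoke` and `get_tensor` calls.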

So.

Coming to software and models, which is a main part of enabling machine learning at the edge. The compiler and the runtime are the downloadable pieces that are used to enable inference. We do provide a huge list of checkpoints and Edge TPU-compiled models for users to build prototypes and do experiments with our existing models, which include state-of-the-art networks like MobileNet, ResNet, Inception and so on. And we do facilitate on-device transfer learning, which is performed on the CPU.

And we have quickstarts.

Training and transfer learning can be done in Google Colab or anywhere, and then you compile the models and pass them to the Coral board.

So most of our software documentation is located at coral.ai, our official website, where you can find all the software documentation, the models and the resources to get started. And we do have sales from that website.

So here are some fun results on the performance of the Edge TPU compared with a regular CPU and also the [inaudible]. Considering a model like MobileNet V2, the average inference time with an input size of 224 by 224 is 2.3 milliseconds. So considering that, you see around 300 frames per second on a video stream. If you take ResNet, you see about 48 milliseconds of latency.
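[Editor's note] The relationship between per-frame latency and frame rate quoted above is simple arithmetic, sketched below. The function name `max_fps` is hypothetical. The result is an upper bound from inference time alone; the lower figure quoted in the talk presumably also covers decoding and pre/post-processing, which is an assumption on the editor's part.

```python
def max_fps(latency_ms):
    """Upper bound on frames per second if each inference takes latency_ms."""
    return 1000.0 / latency_ms


# Latencies quoted in the talk.
mobilenet_v2_ms = 2.3
resnet_ms = 48.0

print(f"MobileNet V2 bound: {max_fps(mobilenet_v2_ms):.0f} fps")
print(f"ResNet bound: {max_fps(resnet_ms):.0f} fps")
```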

So these are the benchmarks so far. We have benchmarked these models and are in the process of benchmarking other models as well. So if you take a look, you can find more models for the Edge TPU.

So now I can walk you through a couple of examples that we have developed so far, and you can see the importance of the Edge TPU. As I said, it preserves the privacy of people and still enables you to know what's going on. So take this application: this is a smart city application where you want to see the mobility of people and the traffic flow. You can perform inference and preserve privacy on the device and just get the result of how many people there are. The same thing here: when analyzing the density of a cafeteria, you need not upload every frame to the cloud to see the results. All you do is perform inference at the edge and push only the results to the cloud. So this saves a huge amount of bandwidth, from gigabytes to bytes. [inaudible] You're saving gigabytes of bandwidth every single day, and that's just from one single camera.

This is another application, where you continuously monitor the traffic flow. In this case, the speed is important. If you're waiting on the cloud to get the results back and forth, there is a lot of latency just because of the cloud. If you're able to perform inference at the edge, you're able to get around 70 to 80 frames per second.
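[Editor's note] The "gigabytes to bytes" bandwidth saving can be made concrete with a back-of-the-envelope sketch. All numbers here are assumptions chosen for illustration (a 100 KB compressed frame at 30 frames per second versus one 4-byte people count per second); none come from the talk. The function name `daily_bytes` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_bytes(bytes_per_message, messages_per_second):
    """Bytes uploaded per day for a steady stream of fixed-size messages."""
    return bytes_per_message * messages_per_second * SECONDS_PER_DAY


# Assumed: upload every 100 KB frame at 30 fps (cloud inference).
frames = daily_bytes(100_000, 30)
# Assumed: upload one 4-byte count per second (edge inference).
counts = daily_bytes(4, 1)

print(f"frames: {frames / 1e9:.1f} GB/day, counts: {counts / 1e3:.1f} KB/day")
```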

So these are applications so far. But if you are looking to do your own and follow most of these applications.

Here's how you do it.

This is our official [inaudible], google-coral, so you can visit this page. We have a number of tutorials and example projects that we have done so far, of which [inaudible] is very famous. You can detect the pose, anonymize users and get their body information, doing detection and segmentation all at the same time. Please feel free to check it out; it is a very famous application of ours. And recently we have even [inaudible]. This uses the USB Accelerator with the Raspberry Pi. So feel free to check out the [inaudible] handle. It has great resources, which can be a starting point for prototyping your products, and you can reach out to the team from there if you have any questions or a special request. So these are two examples of what the public has been doing. There is one company called Neuroleptic, which recently developed a social distancing model running inference on the edge device. So they developed this application.

And to preserve users' privacy, they're performing inference on the edge device and uploading only the results to the cloud, to make sure everyone is following social distancing. And one of our team members has recently developed a face detection model and deployed it on [inaudible]. It works pretty well. We get performance of close to 80 frames per second. So as you have seen a lot of applications and examples, I'm sure you must be curious to know how you can compile a model to deploy on the Edge TPU. Coral uses accelerators that support only the 8-bit integer format. So to do that, you have to train a model, test it and then freeze the model. Use post-training quantization, using the TF Lite converter.

You have to convert it with full integer quantization to a TF Lite model, and then, using the compiler, map the operations to the Edge TPU and then deploy to the hardware. Yes, this graph might be a little unclear, so I'll walk you through the next steps slowly.

So after you train your model in TensorFlow, Keras or any other platform, you take the TensorFlow model, which is in float32 or float64, and quantize the model to run on the Edge TPU. And consider the fact that Coral devices right now only support inferencing; they do not support any training or any other thing. So once the model is frozen and deployed on the Coral device, you cannot retrain the model. So the model has to be frozen before going to the quantization step.

Here, quantization can be done in two ways. One is post-training quantization, which is what most of our customers use. After the training process is completed, they take the trained model and calibrate the weights using a representative dataset, which they have used for training and testing purposes, and then pass it to our compiler. In this case, accuracy drops slightly because the quantization and the calibration are hard to do perfectly.

So we do recommend another way, quantization-aware training. In this case, quantization nodes are introduced during the training process, in which TensorFlow rewrites the graph during training, assigning a fake quantized weight in parallel to the original weight. This doesn't require any representative dataset because you're performing quantization while training, and accuracy is not much affected if you quantize a model using quantization-aware training. And as I said, Coral devices are designed only for inference and the model has to be frozen. And the batch size has to be one [inaudible].
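[Editor's note] To illustrate what converting float weights to 8-bit integers involves, here is a minimal pure-Python sketch of affine (scale and zero-point) quantization. It is a simplified illustration under the editor's assumptions, not the TF Lite converter's actual implementation; the function names and the sample weights are hypothetical.

```python
def quant_params(values, qmin=-128, qmax=127):
    """Pick a scale and zero point so the observed range maps onto int8."""
    lo = min(min(values), 0.0)  # the range must contain 0.0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 codes, clamping to the representable range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point))
            for v in values]


def dequantize(codes, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in codes]


weights = [-0.8, -0.25, 0.0, 0.5, 1.2]  # made-up sample weights
scale, zp = quant_params(weights)
codes = quantize(weights, scale, zp)
restored = dequantize(codes, scale, zp)
print(codes, [round(r, 3) for r in restored])
```

The round trip loses at most about half a quantization step per value here, which is the kind of small accuracy drop the calibration step tries to keep under control.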

Once you're done with quantizing the model, you use the Edge TPU compiler to compile the model for the Edge TPU and deploy it. If you open up an Edge TPU-compiled TF Lite model in Netron, this is how it looks. [inaudible] is an input tensor. And then all the operations that are mapped to the Edge TPU are combined into a single operation, edgetpu-custom-op. This custom op runs on the Edge TPU; all the rest, the input and output and the other operations, are taken care of by the CPU. In this case, this is an object detection model. It has four output tensors, and all the post-processing and preprocessing is offloaded to the CPU, while the Edge TPU is running the feature extraction task and the detection task.

So far, I have been talking, in the other applications, about how we use a cloud monitoring system and how the results from each device are pushed into the cloud and analyzed. But if you are looking at the total end-to-end cloud infrastructure with the Edge TPU at the edge, this is how it looks.

You use Google Cloud Storage buckets to store the data and use AutoML to train your models. You can use either AutoML Vision or AutoML Video.

So use these services to train your models on your data and then use IoT Core to push the trained model onto the Edge TPU. And once the inference is done, the results are pushed to the cloud later, depending on your priority, and you see the results without any private user data. So with this pipeline using the cloud and Edge TPU deployment, you save a lot of bandwidth and also provide a lot of privacy to the users. This enables us to efficiently use our current data centers to store the required data and also to efficiently use power for performing inference, considering the fact that the footprint left by running AI operations is considerably large.

The carbon footprint, that is. So using a low-power device to run inference and save power is very useful. We do have a cloud monitoring demo on our GitHub, which provides you all the tools and the scripts that are required for edge deployment and also the cloud interface.

So you push the data to Cloud IoT Core and then use Dataflow and BigQuery to analyze your results. And of course, you can [inaudible].

So before taking off, I would like to show you a quick tour of how we can deploy a model. So here I am using

[inaudible]

I'm using a Colab to run the compiler, to compile a TF Lite model for the Edge TPU. First of all, let's install the Edge TPU compiler. These instructions can be found on coral.ai, the website.

And meanwhile, we can talk about the TF Lite conversion. [inaudible] For this demo, I'm using the command line. So I have trained and exported an object detection model and have frozen the model with [inaudible] as the input tensor

and four output tensors [inaudible].

So.

Yeah, once I have my frozen graph, I provide all the details of the input and output and specify the inference type as 8-bit integer, and you're all good to go. This takes less than a minute. [inaudible] We support only 8-bit integer operations, and [inaudible] support multiple input and multiple output arrays. Yeah, so the conversion is done and the graph is quantized and saved as a TF Lite file right here. The next step is to compile the TF Lite file using the Edge TPU compiler.

So after compiling the model using the Edge TPU compiler, this creates a file with the suffix _edgetpu.tflite, and it also displays a lot of information about how many operations are going to run on the CPU and how many operations are going to run on the Edge TPU. This is an existing model which was developed and trained for mobile applications. All the operations are mapped to the Edge TPU and only one operation is running on the CPU. That means we are using the Edge TPU's acceleration power to the maximum and we can get the maximum performance out of it. But if you see half of the operations on the TPU and the other half running on the CPU, there is a good amount of latency involved in the I/O operations.

Pushing the data back and forth between the CPU and the TPU increases the latency. So when you're developing a model for the Edge TPU, one thing you have to take care of is to develop a model with all the operations supported by the Edge TPU, to leverage maximum performance from it. This is how you deploy a model, and once you deploy it, you're good to go with running inference on the Edge TPU.
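[Editor's note] The CPU/Edge TPU split reported by the compiler can be pictured with the toy sketch below. It assumes, based on the editor's understanding of the Edge TPU compiler documentation, that operations are mapped to the Edge TPU up to the first unsupported operation and that everything from that point on runs on the CPU. The supported-op set is a tiny illustrative subset, not the real compatibility list, and `partition` is a hypothetical helper.

```python
# Illustrative subset only; see the model compatibility list for the real one.
SUPPORTED_ON_TPU = {"CONV_2D", "DEPTHWISE_CONV_2D", "ADD", "RELU"}


def partition(ops, supported=SUPPORTED_ON_TPU):
    """Split a linear op sequence at the first op the TPU cannot run."""
    for index, op in enumerate(ops):
        if op not in supported:
            return ops[:index], ops[index:]
    return list(ops), []


# A detection-style model: backbone ops, then a post-processing op.
model_ops = ["CONV_2D", "DEPTHWISE_CONV_2D", "CONV_2D",
             "TFLite_Detection_PostProcess"]
tpu_ops, cpu_ops = partition(model_ops)
print(f"{len(tpu_ops)} ops on the Edge TPU, {len(cpu_ops)} on the CPU")
```

Under this assumption, a single unsupported operation early in the graph pushes everything after it onto the CPU, which is why the talk stresses building models from supported operations.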

So.

At last, I do want to introduce you guys to our website, coral.ai.

You know, you can visit our website, coral.ai, and take a look at the products available. It has information and documentation for the Dev Board, the USB Accelerator and [inaudible].

Here is the model compatibility page. This is the list of operations that we currently support on the Edge TPU. Any operation that is off this list is offloaded to the CPU, and the CPU will take care of it. But if a model has mostly these operations, the entire model can be mapped and you can utilize the Edge TPU resources to the maximum. And we do have

the pre-compiled models, so feel free to take a look at the pre-compiled models and download them. So this is how the TF Lite runtime works: the entire model is mapped to the Edge TPU [inaudible]. The TF Lite runtime will take care of splitting the model and running it on the Edge TPU and the CPU. With this, thank you so much, and I'll open up for questions.

Yeah, so we have one here says.

Let's see: what would be the performance if I do BodyPix directly using TF Lite versus USB? I think he's talking about the Coral USB board. What performance delta would you expect to see using that?

Sure. So running on the Pi with versus without the accelerator, I think, is the question.

Oh, so if you are using a USB Accelerator, you definitely require a host CPU. It can be a Raspberry Pi or a laptop or any host computer. And the major factor that controls the inference speed is the model that you choose and the input size that you choose, because we do have ResNet as a backbone and MobileNet as a backbone. So if you're choosing a MobileNet backbone model and you're using a regular resolution, like [inaudible] by 640 or 320 by 320, you're looking at real-time performance close to 40 frames per second or 50 frames per second. OK, but if you choose a ResNet backbone and a bigger image resolution, you're looking at maybe 20 frames per second or 15 frames per second.

Ok, so, you know, fairly significant acceleration on a good, decent model, then, yes.

Yeah, we have another question where somebody is asking about privacy concerns. I kind of had this question, too. In those examples you were showing and you were talking about privacy, you had people walking and you were checking spacing and things like that. Now, what gets sent? The actual image of the person never got sent off the device to a cloud or something like that. Is that correct?

Yes. Yes. So the actual image of the person, or the image of people who are walking, is never sent off the device. It is mostly just the output, a heat map of the segmentation of human beings, or just the number of people present in the [inaudible] at a given time. So the personal data related to people never leaves the device. OK. The same thing happens with the mask detection as well. If you don't have a mask, it says no. [inaudible]

Yeah, yeah. Because, you know, given the privacy concerns people have, I can see that you guys address that, you know, this surveillance society type thing. And there was another question here: is there compatibility for things like Arduino and some of these other boards beyond the, you know, [inaudible]?

Sure. So coming to the support: ARM-based architectures are supported, but if you're using a different OS, I do know that non-Linux devices are not supported. So make sure you use a Linux-based operating system. We have recently open-sourced the Edge TPU library. All the libraries are going to be open, so you can build the drivers for any kind of hardware platform you are going to use.

Yeah, I think I saw that in some of these things where they were talking about

just, you know, edge devices in general, that you guys had given out that source code, but it then became, I guess, the job of the implementor to take your source code and get it running on a particular platform or something like that.

Yes. So previously we were only supporting certain platforms. Now that the library has been open-sourced, you can build it for any platform.

Ok, great. Now, would you take contributions from people let's say they had some sort of chip they're working on and they wanted to, you know, move it over to, you know, get your accelerator running on that? Do you guys take in contributions for that?

Yes. So we do encourage that kind of thing.

And we have a community GitHub repo and also a community page coming up. So we take contributions in that way and just publish them on the [inaudible].

OK, we have a question from [inaudible] asking, what excites you when you work with Coral?

What sort of use cases kind of excite you about what this device, these devices can do and things like that?

Yeah, sure. Yeah. So the most important thing that excites me is being able to deploy AI at the edge in very important use cases where the Internet should not be a bottleneck. For example, take a fall detection use case. We have a PoseNet model, and one of our customers used a PoseNet model to analyze if a person is standing, sitting or has fallen down. Especially for a lot of people, fall detection is the most important thing. It has to be reported in less than five minutes to save their life. And in that case, if you're messing around with Comcast or the network connections, it's going to cost a life. So I've seen AI deployed for many other industrial applications, making money doing something on the edge, but saving lives with it is something that inspired me. And I'm always encouraged by reducing the footprint of AI, because we see many use cases, and I previously worked on a natural language processing application where I learned that [inaudible] training a model left the carbon footprint of seven cars' lifetimes. So, yeah, training models or running inference on models is extensively power-consuming computation. So if you're able to save power and run inference at very low power, that helps, then, yeah.

Yeah. Not everything needs an RTX 2080, I guess, to do its work or things like that.

Yes.

Ok, who is your competition. Somebody had asked that. Who do you see as the competition for coral products here.

And maybe tell us the names that kind of pop to mind.

So I'll answer this question as a non-commercial employee.

Ok. Well, I don't want to get you in trouble here. If you don't feel comfortable, we can, we can.

Yeah. So answering this question as the general public: first thing, Coral is provided by Google Research. So the main motivation of the Coral team itself is to democratize machine learning and help medical applications and industrial applications running at the edge, where Internet connectivity is a problem.

So since it's a research-based organization, we are not into the competition market yet. OK. [inaudible]

[inaudible] and we are very low-power focused, and so far I don't see any other product in the market consuming two watts and being able to deliver four TOPS of performance, so you can run at as low as two watts of power consumption.

It's a piece of cake. Not sure. Wow, that's impressive. I have some Nvidia boards and I know what they consume, and this is, you know, an order of magnitude better than that. OK, well, I think that's all the questions we had for now. So thank you very much for your time; we really appreciate it. We'll look forward to hearing more about what the Coral team is doing, seeing more devices in the future and, you know, getting our hands on some of these really, really cool devices that you guys are building.

Yes, sir, and I really appreciate you giving us this opportunity. I'm happy to share more about Coral, and anyone who is listening to me, please feel free to contact us through our website. We are always excited to see new applications on Coral and always ready to help.

If somebody was building something with Coral, you know, and they wanted to kind of show it off and get a little PR and things, would that be the proper way to go? Is there a channel for that?

Yes. Yes. So there is something like [inaudible] contact. If they have already built something using Coral, they can contact us from our support channel. OK, we're going to get back to them and share the success story on our website.

Ok, great. So that's incentive for us to go out and build things with Coral.

Yes. Yes. You'll be shown on the global success stories.

Oh, that can't hurt anybody's credibility. OK. Well, thank you very much for your time.

Yeah. Thank you so much. OK.

27.08.2020