@Dominik actually there is practically no difference between the parts that process models in a TPU and in an NPU.
As the de facto processing unit is currently the tensor core, the choice of interface mostly affects whether the whole model or just the current training step is loaded, but in both cases batches are processed in the unit’s own RAM.
The USB sticks, by comparison, support a smaller compatibility subset and fewer register types, mainly due to cost.
Google Cloud TPUs are NPU-based, not GPU-based, if you can accept the approximation in terminology, as it’s essentially the tensor cores of a GPU without the GPU stuff it doesn’t need.
Yes, there is a lot of marketing speak involved here. To my understanding the Rockchip NPU is an FPGA and supports “reprogramming”, while the Coral Edge TPU is a “hardcoded” ASIC.
I think the Rockchip is somewhere between the two.
You can hire a Cloud TPU v2 for $4.50 per TPU hour, or $1.35 per TPU hour preemptible (i.e. you get kicked off if someone wants to pay $4.50, until there is spare capacity again).
That is about 180 teraflops per device (the 11.5 petaflops figure is for a full v2 Pod), and for training it doesn’t seem to make sense to purchase hardware when you can rent that.
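To put rough numbers on the rental option, here is a minimal sketch using only the two hourly rates quoted above. The function names are made up for illustration, and real prices will have changed since this was posted.

```python
# Hourly rates as quoted in this thread (USD per TPU hour); check current pricing.
ON_DEMAND_RATE = 4.50
PREEMPTIBLE_RATE = 1.35


def rental_cost(hours, rate_per_hour):
    """Cost in USD of renting for the given number of hours at a flat hourly rate."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * rate_per_hour


def preemptible_saving(hours):
    """USD saved by using the preemptible rate, assuming the job is never interrupted."""
    return rental_cost(hours, ON_DEMAND_RATE) - rental_cost(hours, PREEMPTIBLE_RATE)


if __name__ == "__main__":
    hours = 10
    print(f"on demand:   ${rental_cost(hours, ON_DEMAND_RATE):.2f}")
    print(f"preemptible: ${rental_cost(hours, PREEMPTIBLE_RATE):.2f}")
    print(f"saving:      ${preemptible_saving(hours):.2f}")
```

Bear in mind a preemptible job can be kicked off, so the hours actually needed may end up higher than planned.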
Most common voice models run happily at faster than real time on a CPU, but training can be an almighty endurance chore.
But you can hire server time and have returned in minutes what would take hours even with some considerable hardware.
But the little sticks are quite impressive, seeing as an RTX 2080 is about 12 TFLOPS, and the vision models are pretty excellent, but thankfully voice streams are much slower than visual ones.
I think you have to delegate to libedgetpu.so.1, otherwise you are just running on the CPU as normal.
But as I said above “No as they are all very similar with different compatibility issues that you will have to research yourself.”
I don’t think many, if any, are using them here, and you are likely to find better answers elsewhere.
Add the delegate when constructing the Interpreter.
For example, your TensorFlow Lite code will ordinarily have a line like this:
The file passed to load_delegate() is the Edge TPU runtime library, and you installed it when you first set up your device. The filename you must use here depends on your host operating system, as follows:
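The code lines from the quoted guide did not survive in this post. As a rough sketch only, assuming the `tflite_runtime` Python package and the Edge TPU runtime are installed, the change looks something like this. The per-OS filenames are the ones I recall from the Coral docs, so verify them against your install. The helper names `edgetpu_library_name` and `make_interpreter` are made up for illustration.

```python
import platform

# Edge TPU runtime library per host OS (as recalled from the Coral docs; verify locally).
EDGETPU_LIBS = {
    "Linux": "libedgetpu.so.1",
    "Darwin": "libedgetpu.1.dylib",  # macOS
    "Windows": "edgetpu.dll",
}


def edgetpu_library_name(system=None):
    """Return the Edge TPU runtime filename for the given (or current) OS."""
    system = system or platform.system()
    try:
        return EDGETPU_LIBS[system]
    except KeyError:
        raise RuntimeError(f"No known Edge TPU runtime for OS {system!r}")


def make_interpreter(model_path, use_edgetpu=True):
    """Build a TFLite interpreter, optionally with the Edge TPU delegate attached."""
    # Imported lazily so the filename helper works without tflite_runtime installed.
    import tflite_runtime.interpreter as tflite

    if not use_edgetpu:
        # No delegate: the model runs on the CPU as usual.
        return tflite.Interpreter(model_path=model_path)
    delegate = tflite.load_delegate(edgetpu_library_name())
    return tflite.Interpreter(model_path=model_path, experimental_delegates=[delegate])
```

Usage would be along the lines of `interpreter = make_interpreter("model_edgetpu.tflite")` followed by `interpreter.allocate_tensors()`, where the model filename is a placeholder and the model has to be compiled for the Edge TPU.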
I have run a CNN model on the Google Coral TPU Dev Board, but I am not sure it actually accelerates the model… Is there a command I can use to see the ASIC load specifically? And, more importantly, how long does TensorFlow Lite take to run the CNN model with acceleration?
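I am not aware of a command that reports the Edge TPU load directly, so one practical check is to time the inference call yourself, once with the delegate and once without, and compare. A minimal sketch, where `time_invocations` and `summarize` are hypothetical helper names and `invoke` is any zero-argument callable such as `interpreter.invoke`:

```python
import statistics
import time


def time_invocations(invoke, runs=50, warmup=5):
    """Call invoke() repeatedly and return the per-call latencies in milliseconds."""
    for _ in range(warmup):
        invoke()  # untimed: the first calls often include one-off setup cost
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms):
    """Reduce a list of latencies to a few headline numbers."""
    return {
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }
```

Something like `summarize(time_invocations(interpreter.invoke))` for both the accelerated and the plain CPU interpreter should show whether the delegate is making a difference.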
I was wondering if there is open-source software to blur faces, license plates and company labeling on vehicles on the fly in video streams?
This would make the stream GDPR compliant as long as it is not recording private property.
Ideally, an additional feature would allow blacking out private property if a camera is mounted in a fixed position.
If there is such software, how well would the Coral Edge TPU perform? One or more streams? At what resolution and frame rate?
A single Coral can handle many cameras and will be sufficient for the majority of users. You can calculate the maximum performance of your Coral based on the inference speed reported by Frigate. With an inference speed of 10, your Coral will top out at 1000/10=100, or 100 frames per second. If your detection fps is regularly getting close to that, you should first consider tuning motion masks. If those are already properly configured, a second Coral may be needed.
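The arithmetic in that quote can be written out as a small sketch. It assumes the inference speed is reported in milliseconds per inference, which is what the 1000/10 calculation implies. The function names are made up for illustration, and `corals_needed` further assumes throughput scales linearly across devices, which is my simplification rather than anything the quote states.

```python
import math


def max_detection_fps(inference_speed_ms):
    """Upper bound on detections per second for one Coral, given ms per inference."""
    if inference_speed_ms <= 0:
        raise ValueError("inference speed must be positive")
    return 1000.0 / inference_speed_ms


def corals_needed(total_detection_fps, inference_speed_ms):
    """Rough device count for a target detection rate (assumes linear scaling)."""
    return math.ceil(total_detection_fps / max_detection_fps(inference_speed_ms))


if __name__ == "__main__":
    print(max_detection_fps(10))   # the example from the quote: 1000/10
    print(corals_needed(150, 10))  # a load above what a single device can sustain
```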
A Pi, though, comes at the bottom of the list as the base board.
A Coral M.2 is half the price, with maybe the dual E-key one being the best value, even though I have never tried it.
GDPR is more about the use and retention of data, informing people about it, and stopping the misuse and sharing of that data.
If you comply correctly then those faces, license plates and company labeling are not a problem, but yeah, there are ML segmentation models to capture those, so likely the same could be used to obscure them.
If you do some googling and find the perfect int8 model for edge ml then performance is extremely impressive.