Real-time image classification is a complex problem, but given the current state of image recognition technology, this should be very doable. Based on your description, I think you will need two layers: one for brand recognition and another for text recognition. Depending on the types of video streams you are analyzing, it could be very intensive for your machine to process, but I will gladly work with you to tune the accuracy/power trade-off.
Just last year I graduated with a master's degree in Computer Science, taking courses in image recognition, AI, and machine learning. Through these courses and other side projects, I have worked with various state-of-the-art image recognition and classification libraries. I have also spent the last 3 years working as a professional software engineer in the aerospace industry.
Outline:
This could be completed in just about any mainstream language, so if you have a language preference, let me know. Otherwise I would probably use C++ or C#, as those are the languages I am most familiar with.
I would first need more information about how the program will interface with the video stream before I can give more details on that aspect.
The basic structure of the program should be simple: process the video stream, run each frame through the recognition layers, and dump the output in whatever format you would like. The challenge will be in training the classifiers to work with your specific video stream.
I'd be excited to work with you and go over project details!