On Thursday, we asked Google about its new custom-made chip for artificial intelligence called a Tensor Processing Unit, or TPU.
Google politely declined to answer Recode's questions, saying only that "more information is coming later."
Later in the day, after we posted our questions, Google changed its mind and gave us some short answers to a few — but not all — of our questions, via an email from a spokesperson. Here's what we learned.
1. Is the TPU pre-trained?
This is a big one that Google did not answer. It comes down to whether the TPU is executing an AI algorithm that has been "trained" on a separate system. To train an AI system to recognize what's in a photograph, you have to show it millions of photographs and let it learn by trial and error. Once the learning is done, you just let the finished algorithm run on a chip. Examples of "pre-trained" chips include IBM's TrueNorth and Fathom, from startup Movidius. Chances are pretty good the TPU is pre-trained too, but Google wouldn't say.
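The training-versus-inference split can be sketched in a few lines of Python. This is a purely hypothetical toy model, not anything from Google: training tunes a weight by trial and error, which is expensive; inference just applies the frozen weight, which is all a "pre-trained" chip has to do.

```python
# Toy illustration of training vs. inference (a hypothetical example,
# not Google's actual workflow). Training adjusts a weight by trial
# and error; inference just applies the frozen weight to new inputs.

def train(samples, steps=1000, lr=0.05):
    """Fit y = w * x by gradient descent -- the expensive phase."""
    w = 0.0
    for _ in range(steps):
        for x, y in samples:
            error = w * x - y
            w -= lr * error * x   # nudge the weight to shrink the error
    return w

def infer(w, x):
    """Run the frozen model -- the cheap phase a pre-trained chip executes."""
    return w * x

# Train once, on a separate system...
w = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
# ...then ship only the learned weight for inference.
print(round(infer(w, 5.0), 2))  # → 10.0
```

The point of the toy: once `train` has run, `infer` needs no learning machinery at all, which is why an inference-only chip can get away with simpler hardware.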
2. Can the TPU be reprogrammed if AI algorithms change?
This question included a sub-question, which requires a bit of explanation. In the taxonomy of chips, the TPU is considered an ASIC, or application-specific integrated circuit. In English: a chip designed to do one thing extremely efficiently.
The conceit of our question was this: If AI algorithms change, and logically they should over time, then wouldn't you want a chip you can reprogram to accommodate those changes? If that's the case, then a different type of chip, known as a Field Programmable Gate Array, or FPGA, comes to mind. It allows for reprogramming, unlike an ASIC. Microsoft is notably using FPGA chips to enhance some AI functions in its Bing search engine. So naturally we wondered, why not use an FPGA?
Google's answer: FPGAs are much less power efficient than ASICs due to their programmable nature. The TPU has an instruction set, so as TensorFlow programs change or new algorithms are developed they can run on the TPU.
This answer takes a bit of unpacking. Power consumption is an important consideration at Google, whose data centers are huge and span the world, in places as varied as Finland and Taiwan. Higher power consumption raises the cost of operating a data center, and multiplied many times over, it adds up to real money. Google's engineers weighed the efficiency of FPGAs against ASICs and decided on the latter.
The second part of the answer refers to the TPU's instruction set. This is the set of commands hard-coded into the chip that it recognizes and can perform; in the chip world, it's considered fundamental to a processor's operation.
The instruction set on the TPU has been created specifically to run TensorFlow, the open source software library designed for creating AI applications. I think Google is saying that if any changes to the underlying AI are necessary, they'll happen in the software, while the chip is flexible enough to accommodate those changes.
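Google's point about the instruction set can be sketched with a toy interpreter. Everything here is hypothetical, since the TPU's real instructions are not public: the hardware exposes a small, fixed set of operations, and changing the algorithm just means emitting a different sequence of those same operations in software.

```python
# Toy sketch of a fixed hardware "instruction set" (hypothetical --
# the TPU's actual instructions are not public). The ops never change;
# new algorithms are simply new programs built from the same ops.

INSTRUCTION_SET = {
    "mul":  lambda a, b: a * b,
    "add":  lambda a, b: a + b,
    "relu": lambda a, _: max(a, 0.0),
}

def run(program, x):
    """Execute a program -- a list of (op, operand) pairs -- on input x."""
    acc = x
    for op, operand in program:
        acc = INSTRUCTION_SET[op](acc, operand)
    return acc

# Two different "algorithms," both expressible on the same fixed hardware:
linear = [("mul", 3.0), ("add", 1.0)]                      # 3x + 1
rectified = [("mul", -2.0), ("add", 5.0), ("relu", None)]  # max(-2x + 5, 0)

print(run(linear, 2.0))     # → 7.0
print(run(rectified, 4.0))  # → 0.0
```

In this reading, `INSTRUCTION_SET` stands in for the silicon, which is frozen, while the `program` lists stand in for TensorFlow software, which is free to evolve.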
The technical details of the TPU's architecture have a lot of people who think about chips speculating. Joshua Ho at AnandTech has an interesting theory that it may more closely resemble a third type of chip, known as a digital signal processor. Let the chin-scratching commence.
3. Will TPUs work only with TensorFlow?
Since there are about a half-dozen other software frameworks for AI applications, it's natural to wonder whether the TPU can work with them, too.
Google's answer: Other code can run on TPUs, but TPUs are optimized for TensorFlow.
4. Could several TPUs be connected in a system to work together?
Google's answer: Multiple TPUs were used in the AlphaGo match. So they can work together as part of a larger system.
This is a super-interesting answer that triggers Skynet-like visions of hordes of AI chips out-thinking humans at all sorts of things. The AlphaGo match was the human-versus-computer matchup earlier this year in which Google's DeepMind computer defeated Lee Sedol, the human champion of the board game Go. Sedol later scored one for humanity by winning a game. Still, it's likely cold comfort for Sedol, knowing he was up against many TPUs at once.
5. In the server rack, why is the TPU inserted inside a hard drive?
Google had said that the TPU had been inserted in its server racks next to the hard drive, which seems a little odd. Why wouldn't you want it a little closer to the server's CPU, where all the computing action takes place?
Google's answer: The TPU has a PCIe connector. It doesn’t matter whether a PCIe card is in a motherboard PCIe slot or connected by a short cable. We had room for them in disk slots in our servers, so that was just a convenient place.
6. Where is the memory?
Google's answer: Under the heat sink.
This is just as we thought. If you go back to question number one — whether this chip is pre-trained — a trainable chip requires a lot of memory. Judging from Google's photographs, there doesn't appear to be room for significant amounts of memory under that heat sink. So there doesn't seem to be much memory on the chip, increasing the likelihood that the TPU is indeed pre-trained.
7. Where is the chip being built?
Google didn't answer this one either. Unless it's hiding a factory somewhere, it would have to hire someone to build the TPU. The leading candidates are the two largest chip foundry companies in the world: Taiwan Semiconductor Manufacturing and GlobalFoundries. Let the rumors persist.
This article originally appeared on Recode.net.