Capturing structural similarity has been a hot topic in the field of network embedding (NE) recently due to its great help in understanding node functions and behaviors. However, existing works have paid very much attention to learning structures on homogeneous networks, while the related study on heterogeneous networks is still void. In this article, we try to take the first step for representation learning on heterostructures, which is very challenging due to their highly diverse combinations of node types and underlying structures. To effectively distinguish diverse heterostructures, we first propose a theoretically guaranteed technique called heterogeneous anonymous walk (HAW) and give two more applicable variants. Then, we devise the HAW embedding (HAWE) and its variants in a data-driven manner to circumvent using an extremely large number of possible walks and train embeddings by predicting occurring walks in the neighborhood of each node. Finally, we design and apply extensive and illustrative experiments on synthetic and real-world networks to build a benchmark on heterostructure learning and evaluate the effectiveness of our methods. The results demonstrate our methods achieve outstanding performance compared with both homogeneous and heterogeneous classic methods and can be applied on large-scale networks.