Dunhuang Grottoes Painting Dataset and Benchmark

Challenge Server : https://evalai.cloudcv.org/web/challenges/challenge-page/402/overview

1 Background and Motivation

The Mogao Grottoes, also known as the Thousand Buddha Grottoes, the Caves of the Thousand Buddhas, or the Dunhuang Caves, consist of 492 temples spread over 25 km (16 mi) in the area to the southeast of the ancient city of Dunhuang, an oasis located at a religious and cultural crossroads on the Silk Road in Gansu province, China. The grottoes contain more than 10,000 full-frame paintings, created by successive generations of artists over a thousand years, between the 4th and the 14th centuries. Today, more than 45,000 square meters of murals and more than 2,000 painted sculptures are preserved. The murals are of great value for historical, artistic and technological research, with the earliest dating back more than 1,600 years. The Mogao Grottoes were recognized as a United Nations World Heritage site in 1987.

The mural paintings, however, have suffered various kinds of damage and aging over the centuries.

Fig. 1. Overview of the Dunhuang Grottoes. Left: a Buddha sculpture; Middle: the inside of a grotto; Right: an outside view of the grottoes.

In the 1970s, the Dunhuang Academy was established to systematically preserve the heritage. According to its study, half of the murals suffer from corrosion and aging. Because the paintings were created by different artists over 10 centuries, manual restoration is non-trivial. We therefore release the first Dunhuang Challenge with 600 paintings, which opens data-driven e-heritage restoration to public attention in the research community.

This year, the Academy proposes to collaborate with Microsoft Research and other researchers around the world, aiming to automate the restoration of the wall paintings using computer vision and machine learning technology.

The Mogao Grottoes, a world cultural heritage site, meet all six United Nations world heritage criteria. More than 1,000 years of continuous construction and expansion, starting in AD 366, produced a treasure trove of architecture, murals and painted sculptures.

In recent years, the Academy has started to preserve the murals digitally. Manually restoring the paintings is not trivial: they were created by thousands of artists over 10 centuries, and holistically mastering every style is impractical.

Cave 7 of the Mogao Grottoes was excavated in the Mid-Tang Dynasty (AD 766-835). The murals on its north and south walls feature rich content, such as Buddha statues, bodhisattvas, sponsors, architecture, dance, music, and decorative patterns. Based on the digitization of these murals, 600 images from different murals, with resolutions between 500 and 800 pixels, were selected for the data set in line with the principle of image content integrity. Of these 600 images, 500 are stored in the “train” folder as the training data set, and the remaining 100 are stored in the “test” folder as the test data set.

2 Dataset Generation

2.1 Generating the clear training and testing set

We use the wall painting in Grotto No. 7 for data generation. The painting has a balance of well-preserved and deteriorated regions. Fig. 3 gives an overview of the grotto wall painting.

Archaeologists in Dunhuang helped to divide and slice the huge grotto painting into 600 dataset images. Each image focuses on a theme such as Buddha, architecture, decoration, or human figures. Each image is around $500 \times 800$ pixels at 75 dpi.

Dataset Split. The 600 images are randomly split into 500 images for training and validation and 100 images for testing.
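A random 500/100 split of this kind can be sketched as follows. This is only an illustration, not the procedure actually used to build the released folders: the function name `split_dataset`, the integer image ids 1 to 600, and the fixed seed are all assumptions made for the example.

```python
import random


def split_dataset(image_ids, n_test=100, seed=0):
    """Randomly partition image ids into (train_val, test) lists.

    The seed is a hypothetical choice that makes the sketch reproducible.
    """
    ids = list(image_ids)
    if n_test > len(ids):
        raise ValueError("n_test exceeds the number of images")
    rng = random.Random(seed)
    rng.shuffle(ids)
    test = sorted(ids[:n_test])        # held-out test images
    train_val = sorted(ids[n_test:])   # training and validation images
    return train_val, test


# Assumed ids 1..600; the real indexing of the 600 images may differ.
train_val_ids, test_ids = split_dataset(range(1, 601))
```

Users who want a separate validation set can apply the same function again to `train_val_ids`.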

Later, as described in Sec. 2.2, we provide a method to generate the deteriorated images which best simulates the real deterioration. However, the users are encouraged to generate their own deterioration for training.

Fig. 2. Left: Wall painting damaged from aging; Right: Partial manual restoration

2.2 Generating the deteriorated training and testing set

To help users better understand the deterioration caused by aging, we introduce one simulation method and provide the resulting deteriorated images for the 500 training images. However, users are always encouraged to generate their own deteriorated data. The code is not published during the challenge.

In detail, simulating deteriorated non-rigid regions in an image involves two stages: 1. random mask generation; and 2. image masking.

Random Mask Generation. The process of mask generation can be decomposed into the following steps:

1) Initialize a square blank image with all values set to 1. This blank image serves as a canvas for drawing the mask. The size of the initial mask is $256 \times 256$.

2) Randomly pick a start point $P_{0}$ on the blank image, and set its pixel value to 0.

3) Iteratively perform a random walk from $P_{i}$ to $P_{i+1}$. Once a pixel is traversed, its value is set to 0. Note that a pixel may be walked on more than once. The default number of walk steps is 10,000.
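Since the challenge code is not published, the three steps above can only be approximated. The following is a minimal sketch under stated assumptions: each step moves to one of the four axis-aligned neighbors with equal probability, and a step that would leave the canvas is clamped to the border. Neither choice is specified in the text, and the name `random_walk_mask` is hypothetical.

```python
import numpy as np


def random_walk_mask(size=256, n_steps=10_000, seed=None):
    """Return a (size, size) uint8 mask: 1 = intact, 0 = deteriorated."""
    rng = np.random.default_rng(seed)

    # Step 1: blank square canvas with every value set to 1.
    mask = np.ones((size, size), dtype=np.uint8)

    # Step 2: random start point P_0, set to 0.
    y, x = (int(v) for v in rng.integers(0, size, size=2))
    mask[y, x] = 0

    # Step 3: random walk P_i -> P_{i+1}.
    # ASSUMPTION: 4-neighborhood moves, clamped at the image border.
    moves = np.array([(-1, 0), (1, 0), (0, -1), (0, 1)])
    steps = moves[rng.integers(0, 4, size=n_steps)].tolist()
    for dy, dx in steps:
        y = min(max(y + dy, 0), size - 1)
        x = min(max(x + dx, 0), size - 1)
        mask[y, x] = 0  # a pixel may be visited more than once
    return mask
```

Because pixels can be revisited, the number of zero-valued pixels is at most `n_steps + 1` and is usually much smaller.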

Masking Images. All ground-truth images in the test set are turned into test samples in two steps: 1) rescale the mask to the ground-truth image size; and 2) set the RGB pixels of the ground-truth image that are covered by the mask to [0,0,0].
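The two masking steps can be illustrated with the sketch below. It assumes nearest-neighbor interpolation for the rescaling, which keeps the mask binary; the text does not state which interpolation is used. The helper names `resize_nearest` and `apply_mask` are hypothetical, and the mask convention (0 = deteriorated) follows the mask generation steps above.

```python
import numpy as np


def resize_nearest(mask, height, width):
    """Nearest-neighbor resize of a 2-D mask to (height, width)."""
    rows = np.arange(height) * mask.shape[0] // height
    cols = np.arange(width) * mask.shape[1] // width
    return mask[rows[:, None], cols[None, :]]


def apply_mask(image, mask):
    """Return a copy of an (H, W, 3) image with masked pixels set to [0, 0, 0]."""
    h, w = image.shape[:2]
    scaled = resize_nearest(mask, h, w)   # step 1: rescale mask to image size
    masked = image.copy()
    masked[scaled == 0] = 0               # step 2: black out all three channels
    return masked


# Toy example: a uniform 4x6 image and a 2x2 mask whose top-right cell is 0.
image = np.full((4, 6, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0], [1, 1]], dtype=np.uint8)
masked = apply_mask(image, mask)
```

In the toy example the top-right quarter of `masked` becomes black while the input `image` is left unchanged.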

Fig. 3. Overview of Grotto 7. While the wall painting is well preserved in general, many local areas have deteriorated because of moisture and pests.

3 Dataset Usage

3.1 Access the Challenge Dataset

The challenge dataset can be downloaded from the cloud platform. Registration is required.

3.2 Content of The Downloaded Package

You will receive a zip file package containing a few folders:

Train. This folder contains the training images. The good images are indexed from 001 to 499. Each good image is associated with two images: one for the binary mask of the deteriorated area and one for the deteriorated image. For example, the image triplet indexed as 001 has three images:

001.jpg.

001 mask.jpg.

001 masked.jpg.

which are the good image, mask of the deteriorated area and the deteriorated image respectively.
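For convenience, the three file names of a triplet can be built from its index, as in the sketch below. The function `triplet_paths` is hypothetical. The separator between the index and the `mask`/`masked` suffix is an assumption: the names are printed above with a space, which may stand for an underscore in the actual package, so the separator is left as a parameter. The folder name `train` is likewise assumed.

```python
from pathlib import Path


def triplet_paths(train_dir, index, sep="_"):
    """Return (good, mask, masked) paths for a zero-padded three-digit index.

    ASSUMPTION: `sep` defaults to an underscore; pass sep=" " if the
    downloaded files use a space instead.
    """
    stem = f"{index:03d}"
    root = Path(train_dir)
    return (
        root / f"{stem}.jpg",               # good image
        root / f"{stem}{sep}mask.jpg",      # binary mask of the deteriorated area
        root / f"{stem}{sep}masked.jpg",    # deteriorated image
    )


good, mask, masked = triplet_paths("train", 1)
```

Checking that all three paths exist before training is a cheap way to confirm which naming convention the downloaded package uses.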

The mask and the masked images are provided as a baseline method of simulating the deterioration. It is up to the user to decide whether to use them.

Fig. 4. A glance at the 600-image collection of the dataset, which is generated from Grotto 7. When generating the collection, we balanced the scenes, such as Buddhas, humans, and architecture.