Automatic Speech Recognition and Machine Translation

Our data collection work supports the development of speech recognition and machine translation technologies. We can collect large-scale speech and text data from all over the world on a service-contract basis. We also sell multilingual speech corpora and parallel corpora for academic purposes as well as for the development of ASR/MT systems.


Professional engineers and staff all over the world

Our company was established in 2016 by engineers and staff who have worked for over 30 years in the voice recognition and machine translation departments of Microsoft, Google and Japanese electronics companies. Although Timehill Inc. is only 2 years old, it is a cooperative team whose members have been working in the speech data collection field since 1980.


Our Advantages

We possess, and are able to collect, a large TTS (Text to Speech) database, but our main strength is spontaneous speech data collection. We have the largest database in this field in the world, and this data has already been used for academic purposes by Tokyo University and for the development of ASR/MT systems released by top IT companies from all over the world.


Yoichi Tokioka, CEO and founder of Timehill, Inc.

He is known for designing effective and innovative research and data collection methods for text, speech, image, and video, used in the development of various consumer goods such as mobile apps, PC software, foods and beverages, drawing on deep consumer knowledge and solution-driven insights. He graduated from Waseda University with a Master’s Degree in Philosophy and has published scientific articles and books with Professor K. Hirose of Tokyo University. He was formerly a member of the MT evaluation committee of AAMT (Asia-Pacific Association for Machine Translation). He has also worked as a professional soccer agent for the Premier League in England. He is based in Kyoto, Japan, but travels around the world for research and business. People say he has more flight miles than any airline pilot!


Work With Us:

We would like to offer you our speech and parallel text corpora for developing speech recognition and machine translation technologies. We can visit your office in person if necessary. Please email us a request for information and we will be in touch within a week.


■Speech data collection■

For the development and evaluation of ASR systems, the large-scale, worldwide speech data collection service by Timehill Inc. is very helpful!



Worldwide and large scale

•More than 50 countries and over 200 cities available:

North, Central and South America, Scandinavia, the Baltic countries and other European countries, Africa, the Middle East, Asia and Oceania.


Quality is “BETTER”

•Professional voice recording by our engineers and staff

•Speaker demographics: 50% men and 50% women, from 3 to 100 years old.

•Rigorous evaluation of each participant’s (speaker’s) native-language speech

•We never use the same participant (speaker) twice

•Tight control of sample preparation and product serving specifications

•Drop-out rate is less than 5%


Speed is “FASTER”

•Virtually “Anytime and Anywhere”

•Unbeatable project turnaround

•Real-time results with 24-hour data delivery

<For example>

>10 minutes per speaker: 200 speakers recorded per day

>20-30 minutes per speaker: 20 speakers recorded per day

>In total: between 100 and 1,000 speakers recorded per week
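As a rough illustration of what the figures above mean in recorded audio, the following minimal Python sketch converts a number of speakers and a session length into hours of audio. It assumes that "minutes per speaker" refers to recorded audio per speaker; the function name audio_hours is hypothetical and used only for this example.

```python
def audio_hours(speakers: int, minutes_per_speaker: float) -> float:
    """Total recorded audio in hours.

    Assumption: minutes_per_speaker is the recorded audio length
    per speaker, not the total session time.
    """
    return speakers * minutes_per_speaker / 60


# 200 speakers at 10 minutes each (the first example above)
print(f"{audio_hours(200, 10):.1f} hours per day")

# 20 speakers at 20 to 30 minutes each (the second example above)
low, high = audio_hours(20, 20), audio_hours(20, 30)
print(f"{low:.1f} to {high:.1f} hours per day")
```

Under that assumption, the first example corresponds to roughly 33 hours of audio per day, and the second to roughly 7 to 10 hours per day.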


Cost is “CHEAPER”

•Low sample and recruiting costs

•No costly building leases in numerous markets


Speech style:

•Spontaneous speech (we are able to create natural spontaneous speech)

•TTS (Text to Speech): 100% correct


Recording device:

•Stand microphone

•Head-mounted microphone

•Pin microphone

•Boundary microphone

•Smartphone (iPhone, Android)


Recording room:

•Professional narration booth studio

•Quiet meeting room

•Outdoor field


Sampling rates and bit depth:

•8 kHz, 16 bit

•16 kHz, 16 bit

•22 kHz, 16 bit

•44 kHz, 16 bit

•48 kHz, 16 bit
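To help estimate storage for the formats listed above, here is a minimal Python sketch that computes the size of uncompressed PCM audio. It assumes mono recording and no file header; neither is stated in the list, and the function name pcm_bytes is hypothetical.

```python
def pcm_bytes(sample_rate_hz: int, bit_depth: int = 16,
              channels: int = 1, seconds: float = 60.0) -> int:
    """Size in bytes of uncompressed PCM audio, header excluded.

    Assumptions: mono by default (channels=1) and no compression.
    """
    bytes_per_sample = bit_depth // 8
    return int(sample_rate_hz * bytes_per_sample * channels * seconds)


# One minute of 16-bit mono audio at three of the listed rates
for rate in (8_000, 16_000, 48_000):
    mb_per_minute = pcm_bytes(rate) / 1_000_000
    print(f"{rate // 1000} kHz, 16 bit: {mb_per_minute:.2f} MB per minute")
```

Under these assumptions, one minute of audio takes about 0.96 MB at 8 kHz, 1.92 MB at 16 kHz and 5.76 MB at 48 kHz. Stereo recording would double each figure.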


Recording style:

•Scripted speech (TTS)

•Emotional speech

•Whispered speech

•Spontaneous speech

•Bilingual speech (1 person, 2 languages)

•Lecture speech

•Dialogue speech (2 persons)

•Conversational speech (at least 3 persons)



Quality and price: The price depends on the customer’s order. If there is any error in the speech data, we will re-collect it at no extra charge.

Transcription is available at extra cost.