Guidelines
Ⅰ System description
1. The 43-hour Sinica Taiwan Mandarin Conversational Corpus (TMC Corpus) consists of 30 free conversations between strangers (MCDC8 and MCDC22) and 29 topic-specific and 26 Map Task conversations (MTCC and MMTC) between people acquainted with each other, with average lengths of one hour, 20 minutes, and 10 minutes, respectively. The TMC Corpus is balanced in scenario design and in conversation partner familiarity. Ninety-eight female and 72 male speakers aged 16 to 63 were recorded. Twenty-six speakers took part in all three sub-corpus projects. Conversations were recorded in quiet rooms at Academia Sinica using the SONY TCD-D10 Pro II DAT digital recorder and the Audio-Technica ATM 33a microphone at a sampling rate of 48 kHz, with each speaker on a separate channel. The speech content was orthographically transcribed using traditional Chinese characters. Particles, discourse markers, fillers, word fragments, and paralinguistic sounds that often occur in Chinese conversation are annotated in the transcripts. Only MCDC8 has been manually checked for Pinyin and POS. The rest of the corpora provided in this system are automatically processed, so please use them with caution.
2. Our spoken resource projects concern continuous speech, including adult conversational speech, adult interview speech, child repetitive speech, and child narrative speech. Long speech stretches are segmented into interpause units (IPUs) according to disjuncture cues such as pauses and paralinguistic sounds (e.g., breathing, inhalation, and laughter).
3. FILLER_FEEDBACK, MARKER, PARTICLE_M, and PARTICLE_S are written in Romanized letters in the corpora of this system and can be searched through the POS menu. FILLER_FEEDBACK marks the sounds a speaker makes when responding to a conversation partner. MARKER marks the discourse use of the demonstratives "NA" and "ZHE". PARTICLE_M and PARTICLE_S mark discourse particles in Mandarin and Southern Min, respectively.
4. Sounds or words that could not be clearly recognized during transcription are marked as "uncertain".
5. When the speaker speaks in Southern Min, the content is marked as SOUTHERN_MIN.
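The IPU segmentation described in item 2 above can be illustrated with a minimal sketch. It assumes time-aligned tokens for a single speaker; the `Token` type, the 0.2-second pause threshold, and the paralinguistic labels are hypothetical illustrations, not the criteria or tag set actually used for these corpora.

```python
from dataclasses import dataclass

# Hypothetical labels for paralinguistic sounds; the real tag set may differ.
PARALINGUISTIC = {"BREATHE", "INHALE", "LAUGH"}

@dataclass
class Token:
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds

def segment_ipus(tokens, min_pause=0.2):
    """Split one speaker's tokens into IPUs at pauses or paralinguistic sounds."""
    ipus, current = [], []
    prev_end = None
    for tok in tokens:
        is_boundary_sound = tok.text in PARALINGUISTIC
        long_pause = prev_end is not None and tok.start - prev_end >= min_pause
        # Close the running IPU when a disjuncture cue is found.
        if (is_boundary_sound or long_pause) and current:
            ipus.append(current)
            current = []
        # Paralinguistic sounds act as boundaries and are not kept in an IPU here.
        if not is_boundary_sound:
            current.append(tok)
        prev_end = tok.end
    if current:
        ipus.append(current)
    return ipus
```

In this sketch a paralinguistic sound only closes the current unit; whether such sounds are kept inside, before, or after an IPU in the actual transcripts is not stated in this guide.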
Ⅱ Keyword input
1. When searching by Pinyin, enter the complete Pinyin syllable together with its tone number to find the corresponding Chinese character. When searching for multiple Chinese characters, the tone of every syllable must be entered for results to be displayed.
2. To separate syllables in a Pinyin search, enter either a half-width space or a half-width underscore "_".
Example:
◇ Searching for "zhen" returns all items containing "zhen", including "zhen1", "zhen3", "zheng3", etc.
◇ To search for "然後" by Pinyin, enter the keyword as "ran2_hou4" or "ran2 hou4".
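The two examples above suggest that a Pinyin keyword is matched as a substring and that a half-width space and an underscore are interchangeable. A minimal sketch of that behavior follows. The function names are hypothetical, and the assumption that Pinyin is stored as tone-numbered syllables joined by underscores is an illustration only; the system's internal matching is not documented here.

```python
import re

def normalize_pinyin_query(query: str) -> str:
    """Treat half-width spaces and underscores as the same syllable separator."""
    return re.sub(r"[ _]+", "_", query.strip(" _")).lower()

def pinyin_matches(query: str, item_pinyin: str) -> bool:
    """Substring match, mirroring the "zhen" example (an assumed behavior)."""
    return normalize_pinyin_query(query) in normalize_pinyin_query(item_pinyin)
```

Under these assumptions, "ran2 hou4" and "ran2_hou4" normalize to the same query, and a toneless query such as "zhen" also matches "zheng3", which is why entering complete tones narrows the results.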
Ⅲ Search scope settings
After entering the corpus search page, click the "Select Corpus" button on the left side of the page. There are three corpora in this search system:
一、Sinica Taiwan Mandarin Conversational Corpus
二、Sinica Sociophonetic Corpus
三、Sinica Child Speech Corpus
You can only choose ONE main corpus, but multiple sub-corpora. After selecting, click "OK" to complete the setting.

Ⅳ Search "Sinica Taiwan Mandarin Conversational Corpus"
一、Select corpus:
Click "Select Corpus" on the left, select "Sinica Taiwan Mandarin Conversational Corpus", and then open the drop-down menu to check the sub-corpora you want to search.
二、Set corpus search conditions
1. Set the main keywords. Enter the main keywords you want to search in the first search field (you can enter Chinese characters or Pinyin), and then select the part of speech from the drop-down menu.
2. Set secondary conditions
(1) Choose a search mode
a. IPU-internal (single speaker): Secondary keywords are limited to the same IPU as the primary keyword.
b. Within "n" characters (single speaker): "n" is a value between 1 and 20. The search rule applies only to the same speaker, within n characters adjacent to the left or right sides of the main keyword. It is not limited to the same IPU.
c. Within "n" characters (different speakers): "n" is a value between 1 and 20. Only results from speakers other than the one who produced the main keyword are displayed; results from the same speaker are excluded. The search covers the n characters adjacent to the left or right of the main keyword and is not limited to the same IPU.
(2) Set search conditions for secondary keywords
a. "AND": Search results must contain both primary and secondary keywords.
b. "OR": Results that match either the primary keyword or the secondary keyword are returned.
c. "NOT": If a secondary keyword appears in the search results for primary keywords, this item is not displayed.
(3) You can click "+" or "-" to add or delete secondary keyword condition fields. Please also note that when two or more secondary keyword conditions are set, only the relationship between each condition and the primary keyword is considered, regardless of the order in which the conditions are set.
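Search modes a and b in (1) above can be sketched as follows. This is an illustration under stated assumptions, not the system's implementation: the `IPU` type and function names are hypothetical, IPUs are assumed to be listed in time order, and the secondary keyword is assumed to have to fall entirely inside the n-character window. Mode c (different speakers) is not covered.

```python
from dataclasses import dataclass

@dataclass
class IPU:
    speaker: str
    text: str

def find_all(text, key):
    """Return the start offset of every occurrence of key in text."""
    hits, i = [], text.find(key)
    while i != -1:
        hits.append(i)
        i = text.find(key, i + 1)
    return hits

def ipu_internal(ipus, primary, secondary):
    """Mode a: both keywords occur inside the same IPU."""
    return [u for u in ipus if primary in u.text and secondary in u.text]

def within_n_same_speaker(ipus, primary, secondary, n):
    """Mode b: secondary lies within n characters left or right of primary,
    counted over one speaker's text and across IPU boundaries."""
    if not 1 <= n <= 20:
        raise ValueError("n must be between 1 and 20")
    results = []
    for speaker in sorted({u.speaker for u in ipus}):
        # Join this speaker's IPUs so the window is not limited to one IPU.
        stream = "".join(u.text for u in ipus if u.speaker == speaker)
        for p in find_all(stream, primary):
            left = stream[max(0, p - n):p]
            right = stream[p + len(primary):p + len(primary) + n]
            if secondary in left or secondary in right:
                results.append((speaker, p))
    return results
```

The difference between the two modes shows up when the keywords sit in separate IPUs of the same speaker: `ipu_internal` finds nothing, while `within_n_same_speaker` can still match if n is large enough.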
三、Search results
(1) Search results are presented in their original interactive dialogue format.
(2) The search keywords are highlighted for viewing. However, characters or words that match the search keywords but do not meet the search conditions may also be highlighted, so please check the highlighted results carefully.
(3) In single-speaker searches, keywords from other speakers may also be highlighted due to misjudgment. Please check the results closely before using them for research.
(4) On the search results page, click the "FileID" column to view more content.
(5) In the upper right corner of the search results page, click "View" to open the view window and see the context in which the keyword appears. Click "Export" to download the search results (only the first 150 items in the current version).
(6) View:
<1> Click "View" to view the search results for the keywords. The search results for the primary keyword are displayed. Search results are sorted by secondary keywords.
<2> "View" sorts results by adjacent words to the left/right of each keyword. The search results for the primary keyword are displayed. Search results are sorted by secondary keywords.
Ⅴ Search "Sinica Sociophonetic Corpus"
一、Select corpus:
1. Click "Select Corpus" on the left, select "Sinica Sociophonetic Corpus", and then open the drop-down menu to check the sub-corpora you want to search.
二、Set corpus search conditions
1. Set the main keywords. Enter the main keywords you want to search in the first search field (you can enter Chinese characters or Pinyin).
2. Set secondary conditions.
(1) Choose a search mode
a. IPU-internal (single speaker): Secondary keywords are limited to the same IPU as the primary keyword.
b. Within "n" characters (single speaker): "n" is a value between 1 and 20. The search rule applies only to the same speaker, within n characters adjacent to the left or right sides of the main keyword. It is not limited to the same IPU.
(2) Set search conditions for secondary keywords
a. "AND": Search results must contain both primary and secondary keywords.
b. "OR": Results that match either the primary keyword or the secondary keyword are returned.
c. "NOT": If a secondary keyword appears in the search results for primary keywords, this item is not displayed.
(3) You can click "+" or "-" to add or delete secondary keyword condition fields. Please also note that when two or more secondary keyword conditions are set, only the relationship between each condition and the primary keyword is considered, regardless of the order in which the conditions are set.
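The three operators in (2) above can be read as set operations between the items that contain the primary keyword and the items that contain one secondary keyword. The sketch below evaluates a single condition that way. The function name and the list-of-strings data model are hypothetical, and how the system merges the results of several conditions is not specified in this guide, so the sketch does not attempt it.

```python
def apply_condition(items, primary, operator, secondary):
    """Evaluate one secondary condition against the primary keyword.

    Returns the set of indices of the items that satisfy the condition.
    """
    has_primary = {i for i, text in enumerate(items) if primary in text}
    has_secondary = {i for i, text in enumerate(items) if secondary in text}
    if operator == "AND":
        return has_primary & has_secondary   # both keywords present
    if operator == "OR":
        return has_primary | has_secondary   # either keyword present
    if operator == "NOT":
        return has_primary - has_secondary   # primary hits without the secondary
    raise ValueError(f"unknown operator: {operator}")
```

Because each call depends only on the primary keyword and one secondary keyword, the result of a condition does not change with the order in which the conditions are listed.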
三、Search results
(1) The search keywords are highlighted for viewing. However, characters or words that match the search keywords but do not meet the search conditions may also be highlighted, so please check the highlighted results carefully.
(2) On the search results page, click the "FileID" column to view more content.
(3) In the upper right corner of the search results page, click "View" to open the view window and see the context in which the keyword appears. Click "Export" to download the search results (only the first 150 items in the current version).
(4) View:
<1> Click "View" to view the search results for the keywords. The search results for the primary keyword are displayed. Search results are sorted by secondary keywords.
<2> "View" sorts results by adjacent words to the left/right of each keyword. The search results for the primary keyword are displayed. Search results are sorted by secondary keywords.
Ⅵ Search "Sinica Child Speech Corpus"
一、Select corpus:
1. Click "Select Corpus" on the left, select "Sinica Child Speech Corpus", and then open the drop-down menu to check the sub-corpora you want to search.
二、Set corpus search conditions
1. Set the main keywords. Enter the main keywords you want to search in the first search field (you can enter Chinese characters or Pinyin).
2. Set secondary conditions
(1) Choose a search mode
a. IPU-internal (single speaker): Secondary keywords are limited to the same IPU as the primary keyword.
b. Within "n" characters (single speaker): "n" is a value between 1 and 20. The search rule applies only to the same speaker, within n characters adjacent to the left or right sides of the main keyword. It is not limited to the same IPU.
(2) Set search conditions for secondary keywords
a. "AND": Search results must contain both primary and secondary keywords.
b. "OR": Results that match either the primary keyword or the secondary keyword are returned.
c. "NOT": If a secondary keyword appears in the search results for primary keywords, this item is not displayed.
(3) You can click "+" or "-" to add or delete secondary keyword condition fields. Please also note that when two or more secondary keyword conditions are set, only the relationship between each condition and the primary keyword is considered, regardless of the order in which the conditions are set.
三、Search results
(1) The search keywords are highlighted for viewing. However, characters or words that match the search keywords but do not meet the search conditions may also be highlighted, so please check the highlighted results carefully.
(2) On the search results page, click the "FileID" column to view more content.
(3) In the upper right corner of the search results page, click "View" to open the view window and see the context in which the keyword appears. Click "Export" to download the search results (only the first 150 items in the current version).
(4) View:
<1> Click "View" to view the search results for the keywords. The search results for the primary keyword are displayed. Search results are sorted by secondary keywords.
<2> "View" sorts results by adjacent words to the left/right of each keyword. The search results for the primary keyword are displayed. Search results are sorted by secondary keywords.
Ⅶ Account application
1. Download and fill out the application form.
2. The application form must be filled out completely and stamped with the official seal of your institution (university, company, etc.).
3. Upload the application file (.pdf). You can click "Preview File" to confirm that the upload succeeded.
4. Click "Confirm Application" to complete the account application.
5. If the application is approved, the username and password will be sent to you by email, and you can start using the system.
6. The account is valid for one year. After it expires, you must apply again.