Middle Kingdom Life recently received an inquiry from a foreign teacher regarding an online “psychological test” he was asked to take by a prospective employer as a pre-employment screening tool. We had never heard of this test so we scouted around on the Internet until we found the “Evaluation System for Foreign Language Expert.”
The evaluation system consists of five distinct parts: 1) Evaluation of the candidate’s posted résumé; 2) A Psychological Test; 3) Elementary Test; 4) Writing Examination; and 5) An Online Interview which must be scheduled in advance. Finally, an evaluation report is available to the registered user so that he or she can see the numerical results of the three online exams (presumably with a maximum score of 100 on each subtest).
Before discussing the substance of the three online exams, it is worthwhile to note how the administrators describe the purpose of this evaluation system:*
This system has been developed together with the State Administration of Foreign Experts Affairs, PRC and the Capital University of Economics and Business. It is designed to carry out the recently promulgated Administrative Licensing Law of the People’s Republic of china. The Law is being instituted in order to regulate the issuance of permits for foreign experts and to resolve current problems on introducing oral English teachers into the nation’s public education services. In a long-term perspective, it will accelerate the building the international professional personnel market, foster and standardize international personnel agencies and further strengthen market access, regulation and supervision.
This system consists of three subsystems: resume filtration; written testing, and an online interview. The tests will comprise a general personnel evaluation psychological profile and a general knowledge test on China and other current world topics. This system will automatically analyze these tests and generate an evaluation report. The results will help employers evaluate the candidate’s basic information and his/her abilities to become a successful professional worker in China. It will also offer candidates a reliable measuring tool to know their strengths and weaknesses and give a more accurate career orientation.
For the purposes of this article, I registered a user account and then proceeded to take each of the three online tests. Before answering each exam, I created digital snapshots of all the questions for further analysis. As the material is copyrighted, I cannot post the exam material on this site but, as you will soon learn, it is entirely unnecessary for me to do so for the purpose of preparing those who might be asked to participate in the evaluation. That is, based on what you will read, you will very easily be able to score quite high on these exams. And therein lies the first of many problems with this evaluation system.
The manner in which the exams are administered renders them completely useless, even if the exams themselves were valid (which, as you will learn, they are not at all). Aside from the fact that there is no practical way to ensure that the registrant is in fact the exam taker, one can create digital snapshots of all the questions as I did, look up the correct answers on the Internet, and then re-access each exam after a 24-hour wait period.
I started with the psychological test, as this was the one I was most interested in (examinees can take the tests in any order). It consists of 100 questions and presumably seeks to measure 23 distinct personal attributes or characteristics.
Upon even a cursory examination of the 23 traits this test purports to measure, one need not be a Western psychologist to surmise that this is neither a standardized test nor one that has any precedent of use. Many of the questions are loosely based on items found in classic personality measures such as the Myers-Briggs Type Indicator, but the items appear to have been deliberately reworded, and quite poorly at that, presumably to avoid copyright infringement.
For at least four or five questions, I had to guess at what was intended and that alone invalidates the test, i.e., a test cannot accurately measure what it purports to if the meaning of the questions is unclear.
It was obvious to me that the instrument was written to measure what the SAFEA believes would constitute a good foreign teacher: someone who is resilient and cooperative, who takes initiative while still being able to go with the flow and adhere to the will and direction of the majority, and who neither smokes nor drinks. The test suffers enormously from demand-bias, that is, it is extremely easy to ascertain which answer is the “correct” one (especially after reading this article). I made the mistake of second-guessing the expectation of the exam’s authors in regard to cigarette smoking: Surely, I thought to myself, in China, cigarette smoking must be considered a desirable trait? Because I deliberately tempered my responses to the four questions on cigarette smoking to indicate moderation (as opposed to strongly rejecting the behavior), I scored a 93 on the test instead of what I assume would have been 100 if I had strongly endorsed all negative statements about smoking.
The answers to each question are recorded along what is intended to be a standard 5-point Likert scale, although the labels were anything but conventional: Highly Accept, Accept, Moderate, Not Accept, Refuse. I personally found the last category of response to be very confusing, as I wasn’t initially sure if “refuse” meant that I was refusing to answer the question or that I was indicating strong disagreement (based on my final score, it is fair to assume that the latter was intended). Why the authors of the exam decided to use such unconventional labels for recording responses when years of research have validated the accuracy of simply using “Strongly Agree, Agree, Undecided (or ‘neither agree nor disagree’), Disagree, and Strongly Disagree” was completely lost on me.
Finally, many (if not most) of the questions are poorly translated into English: there is one typo (“wist” instead of “fist”), idiosyncratic word usage, and several unclear meanings. I consider this test to be neither valid nor reliable in regard to what it claims to be measuring (much like the CET and TEM exams, see below). The exam is timed at 20 minutes, presumably to force the examinee to answer as spontaneously as possible without the opportunity to give each question much thought.
The Psychological Test shone in comparison to the Elementary Test, an absurd 40-question exam on what was supposedly intended to be general knowledge. Most of the questions measured not common knowledge but trivial facts about China that perhaps a few highly experienced veterans might have picked up along the way. For example, for those who have been living in China for at least three years: Do you know the precise month in which every Chinese holiday is celebrated (including all the minor ones that are not considered national holidays)? Why any prospective foreign teacher should reasonably be expected to know the answers to questions of this type, which by my count made up 34 of the 40, is a complete mystery to me. The remaining six questions could be argued to measure the type of general knowledge that a well-educated Westerner would or should have (and one of those questions is asked twice, so only five unique questions are valid). At least one of the questions contained no correct answer due to an apparent typo.
The Writing Examination is the only one of the three tests that possesses any face validity, as it is based on the types of questions typically found on the writing part of English tests for non-native speakers (e.g., TOEFL, IELTS): it asks the examinee to write a 200-word essay (about two short paragraphs) expressing an opinion on a common topic in which two points of view are proffered. The user has 10 minutes in which to formulate a position and finish the essay. As my writing sample has not yet been scored, I can’t address how subjective the scoring might be. My approach was to give validity to each side of the debate (although, in this particular case, that was an accurate reflection of how I felt). I am curious to see how I scored on the writing test.
As I sat through and then later pondered this “Evaluation System for Foreign Language Expert,” I was reminded of the time a few years ago when the building management, in response to numerous complaints I had made every day over a two-week period regarding the rather serious plumbing problem in my apartment, sent a “plumber” who was equipped with a Phillips screwdriver in one hand, and a roll of duct tape and a pack of cigarettes in the other. Obviously, he wasn’t the least bit prepared to solve the problem he had been assigned to address, assuming he even had the knowledge.
I commend the SAFEA on its explicit reason for creating such an evaluation system, namely “…to regulate the issuance of permits for foreign experts and to resolve current problems on introducing oral English teachers into the nation’s public education services.” Something like this is long overdue. Unfortunately, this is not the proper way to achieve those goals, not by a long shot.
Pre-employment screening has its place in the world: It is intended to establish a “goodness of fit” between the needs of any given job and the temperament and ability of the various candidates who are seeking the position. Predetermining a good fit between the needs of the job and the capabilities and strengths of an applicant has many benefits, such as a significant reduction in the costs associated with training and employee turnover.
My question to the SAFEA is this: If you are serious about achieving your explicit goals, why not establish a licensing agreement for one of the dozens of time-tested batteries of pre-employment tests that are currently in use? Virtually any one of the dozens of pre-employment, aptitude, or vocational interest tests, such as the Myers-Briggs Type Indicator (MBTI) combined with the Strong Interest Inventory (SII), would make for a much more valid and reliable pre-employment screening measure. In addition, and completely aside from the absurdity of the instruments you are using, developing a system of administration that can so easily be defeated suggests to me that this is something that was “thrown up at the last minute” to meet a requirement, with absolutely no genuine interest in achieving the adopted law’s intent. There are several reputable computerized testing services with branches all over the Western world through which your pre-employment battery of tests could be administered to prospective foreign teachers, Prometric to name but one. The fact that Microsoft, for example, exercises more integrity and caution in how it certifies its systems engineers than China does in certifying its foreign English teachers should be a source of considerable internal concern and reflection.
I would also suggest that there is a much simpler and more straightforward manner in which you can achieve your explicit goals: Simply have each province and municipality adopt and enforce the SAFEA guidelines regarding minimum requirements that have already been in place for years. There are numerous Internet services available for checking the academic credentials, as well as the work history and criminal backgrounds, of job applicants from all over the world. All you need is an international credit card. The Bank of China issues one, as does China Merchants Bank, among a few others.
In the end, does it really matter if a foreigner knows on what specific day Chinese Valentine’s Day is celebrated? Does anyone in the SAFEA truly believe that knowledge of such a trivial fact predicts who will be successful as a foreign oral English teacher in China?
This time around, let’s send in real plumbers to fix the plumbing.
*This is an exact replica of the text contained under the menu item “About this System.”