21 February 2012

Natural Language Processing Enhances EHR Use

www.ZyDoc.com Presentation Outline: - Structured Data (CPT-4, ICD-9, ICD-10, RxNorm, SNOMED CT, LOINC) is required for data mining and interoperability ...

Thank you. I am Dr. James Maisel. I am with ZyDoc. We work in a number of areas

and I'm going to concentrate specifically on natural language processing and computer-assisted coding, and how we take natural language processing and apply it to the day-to-day work of the radiologist. One of the definite directions medicine is headed is towards structured data, and radiologists are used to dictating and having no structured data, so we've got to have some paradigm change here. CPT-4, ICD-9, ICD-10, RxNorm, SNOMED CT and LOINC are the emerging standards for data. We want data in that structure so that we can do data mining for outcomes, utilization and quality assurance, and answer research questions and economic questions as well.

This data structure will be required. It is mandated for the EHR: ICD-10 for October 2013, and SNOMED CT will be the clinical terminology required in the EHR. The reason these are mandated and desirable is interoperability, so we can exchange data between diverse systems, be they EHRs or PACS or RIS systems, and be able to do this type of analysis there. So interoperability is actually driving this, and we need to take that coded data and move it between diverse systems. Once the data is coded, it is very easy to move it between RIS, PACS, EHRs and HIEs in standard architectural formats like CDA, CCR or CCD documents, or with HL7, and you can take coded data and move it into billing and also analyze it very efficiently from a relational database.

So what is the problem with structured reporting? Well, how many people use structured reporting now? Okay, and does anybody find it more efficient than dictating? Probably only if you have a normal exam. If you have any pathology in the exam, you have to click through a lot of fields. So structured reporting is a definite change in work habits. Radiologists are used to looking at images and dictating while they are looking at the images.
Entering structured data in a computer program requires you to look away from the image, it requires you to navigate through menus, and it takes a lot of time because you have to go through a lot of menus. And if you think you have a lot of dropdown choices now, in October [2013], when your ankle fracture goes to over 200 codes in ICD-10, you will be paging through a number of specific choices. So you're going to have a choice between spending a short amount of time dictating or a long amount of time entering data; the compromise is that you enter less data, so we have limited amounts of structured data. Entering data in a structured format is very inefficient, whereas dictation is much more efficient. So radiologists are very advanced dictators

we work in the transcription industry, and radiologists go over 140 wpm; quite often you have to slow the audio down to actually understand and type it. You can generate a 419-word report of a PET scan in 3 minutes very efficiently, and the doctors who do this day in and day out are familiar with the format and know what to include and not include. Direct entry into an EHR or a structured reporting system in a PACS might take 10 minutes for a case with pathology if you are an interventional radiologist. Good luck.

So why can't you dictate directly into the system, into your EHR or PACS, with speech recognition? Certainly you could, but the problem is that it generally produces text which is not structured, and you can't data mine it. So you are left with a dilemma: do you work efficiently, or do you capture structured data, requiring extra time? So we chose to look at natural language processing as a solution here, to give you the best of both worlds. You capture the documentation efficiently using dictation, and then you take the free text and convert it into structured data using natural language processing. We can generate SNOMED CT, LOINC, RxNorm drug codes, ICD-10, and certainly ICD-9 and CPT-4 for billing, out of one dictation.

So what does natural language processing do, how does it work, and what does a document look like? Well, on the left would be a typical dictation of a physician. The yellow terms are the ICD-10 coded terms that are extracted and put in a database. I am just showing you this in a simple application so you can see. The codes on the right are the ICD-10 codes that are extracted. These are the headings for a physical examination, assessment and plan that allow insertion into an electronic record or other system. When we go to SNOMED coding, SNOMED being the clinical terminology, the yellow terms are the SNOMED terms, and the blue terms are the modifiers.
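To make the extraction step described above concrete, here is a minimal sketch of mapping free-text dictation to codes. The phrase-to-code table is a toy stand-in; a real NLP engine uses full terminologies, negation detection and context, and the specific codes shown are illustrative only:

```python
# Toy sketch: map phrases in a dictated report to ICD-10 codes.
# The phrase-to-code table is illustrative; a production engine
# handles synonyms, negation, and tens of thousands of concepts.
ICD10_TERMS = {
    "pulmonary embolism": "I26.99",
    "pleural effusion": "J90",
    "ankle fracture": "S82.899A",
}

def extract_codes(report_text):
    """Return (phrase, code, offset) for each dictionary phrase found."""
    text = report_text.lower()
    hits = []
    for phrase, code in ICD10_TERMS.items():
        start = text.find(phrase)
        if start != -1:
            hits.append((phrase, code, start))
    return sorted(hits, key=lambda h: h[2])  # in order of appearance

report = "Impression: small right pleural effusion. No pulmonary embolism."
for phrase, code, _ in extract_codes(report):
    print(code, phrase)
```

Note that the second finding here is a pertinent negative ("No pulmonary embolism"); a naive matcher flags it anyway, which is exactly why real engines carry negation and modifier handling, as the talk goes on to describe.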
Pertinent negatives are picked up too. So you can see, for a different descriptor, that you may have dementia with three qualifiers. You can have up to about 28 modifiers in a SNOMED concept. SNOMED is beautiful because it gives you a hierarchical structure, so you can pull out all brain tumors, or a specific type of brain tumor, and drill down to very narrow, specific pathologies to pull data out of the database later on. So natural language processing is very efficient. We are seeing that we can pull out a code for about every second and a half of dictation. Now imagine if you had a structured template where you had to go down and click every second and a half to generate codes. It is just amazing how many codes are pulled
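The hierarchical drill-down described above can be sketched with a toy IS-A graph. The concept names and relationships here are illustrative, not real SNOMED CT identifiers; the point is that one query for a parent concept retrieves every descendant:

```python
# Toy sketch of SNOMED CT's IS-A hierarchy: find a concept and all
# of its descendants, the way one might pull "all brain tumors" or
# drill down to one specific pathology. Names are illustrative.
IS_A = {  # child -> parent
    "glioblastoma": "glioma",
    "astrocytoma": "glioma",
    "glioma": "brain tumor",
    "meningioma": "brain tumor",
    "brain tumor": "neoplasm of CNS",
}

def descendants_of(concept):
    """All concepts that are (transitively) a kind of `concept`."""
    found = set()
    changed = True
    while changed:  # keep sweeping until the closure stops growing
        changed = False
        for child, parent in IS_A.items():
            if (parent == concept or parent in found) and child not in found:
                found.add(child)
                changed = True
    return found

print(sorted(descendants_of("brain tumor")))
```

So a query for "brain tumor" returns gliomas, meningiomas and the more specific glioblastoma and astrocytoma, which is what makes hierarchical terminologies so useful for pulling cohorts out of a coded database.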

out of a transcript, out of text. You can also take other forms of medical information, such as semi-structured EHR output or legacy documents, in virtually any standard input format, preprocess them and feed them into the NLP engine. So what I am showing is an efficient solution that allows you to continue to work as you do, with no change in habits, with dictation; whether you use speech recognition on the front end, or transcription, or legacy documents, you feed those into the NLP engine and then output them in structured format along with the narrative, which is really what physicians want to read to see what is going on.

So the workflow involves the physician dictating (on the top left, in 1), and this could be with live speech recognition, back-end speech recognition or conventional transcription; nobody really cares. You take this text document and submit it to the NLP engine for processing, and this technology then forms the data structure within a database. It has the text corresponding to the codes that are extracted, in XML format. We have done this in Microsoft SQL Server 2008, and you can subject that to queries for research, outcomes or serious adverse effects, or you can pull out CDA messages and send them to an EHR, or to automated coding for billing, all with just dictation. Thank you.

Just a quick question from me. This is very good. The one observation I have had when we have been dealing with NLP, natural language processing, is that it is an imperfect science. We are making it better every day, and you've alluded to voice recognition.
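The database step of that workflow can be sketched in a few lines: extracted codes land in a relational table keyed to the source report, and ordinary SQL then answers research or billing questions. The schema and rows here are hypothetical, chosen only to show the shape of the queries:

```python
import sqlite3

# Sketch of the storage/query step of the workflow: one row per
# extracted code, keyed to the source report. Schema is hypothetical.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE coded_findings (
    report_id INTEGER, system TEXT, code TEXT, term TEXT)""")
rows = [
    (1, "ICD-10",    "J90",      "pleural effusion"),
    (1, "SNOMED CT", "60046008", "pleural effusion"),
    (2, "ICD-10",    "I26.99",   "pulmonary embolism"),
]
db.executemany("INSERT INTO coded_findings VALUES (?,?,?,?)", rows)

# e.g. how many reports carry a given ICD-10 code?
n = db.execute(
    "SELECT COUNT(DISTINCT report_id) FROM coded_findings "
    "WHERE system = 'ICD-10' AND code = 'J90'").fetchone()[0]
print(n)  # 1
```

This is the payoff of coded data the talk keeps returning to: once findings are rows with standard codes, outcomes queries, utilization review and billing extracts are all plain database operations.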
I think that is a positive step, a step in the right direction, in the sense that instead of just typing with your tongue, which is what we do with VR, there is a real possibility of actually capturing documents not just in a flat text file, but in a CDA Level 2 format, which I think you alluded to as well. So, are we really mandating VR now, not just for radiology, but really for document capture across all of the EHR environment?

Right, well, we are seeing that dictation is actually the most efficient way to get information captured, and generally the text is what you want. Any electronic document can be coded. In other medical specialties it is actually a much bigger problem, because they are dealing with a bigger universe of terms and methodologies and histories and categories than radiologists in particular. So the problem is much bigger in medicine as a whole; we deal with inpatient and ambulatory documentation of all specialties. In radiology we can really do a good job because the perplexity

is lower.

So what NLP engine did you use, and what are the precision and recall of that NLP engine?

We are under non-disclosure at this time about our NLP engine, but I can say that we are working with a major teaching university. We were chosen to commercialize their technology, and we have a number of patent-pending processes and technologies surrounding that in pre- and post-processing. So the accuracy is not 100%, but we are now at the point where we can generate about 80% accuracy across all specialties, and we are now able to measure the error rate, which is largely missed findings rather than false positives, and correct those. So we think we will very quickly be able to improve that.

Have you incorporated RadLex at all? Because a lot of the radiology terminology that is used is not encompassed in any of the standard terminologies, which is why RadLex was created.

Right, we have not incorporated RadLex, but I have inquired, and we should be able to incorporate it, translating it to SNOMED, or UMLS, or one of the other vocabularies that we are supplied with [crossmap translation]. Is that correct? Yes. So we should be able to incorporate that, but I don't know of any requests for RadLex systems that need NLP; if you are [requesting it], we can make it.

But you could add that terminology? Yes, absolutely. If there's a need for it, we will do it. Thank you.
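The precision and recall the questioner asks about are computed by comparing extracted codes against a gold-standard coding of the same reports. A minimal sketch, with toy data chosen to mirror the error pattern described in the answer (misses rather than false positives):

```python
def precision_recall(predicted, gold):
    """Precision and recall of an extracted code set vs. a gold set."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # codes found that are truly present
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Toy example: the engine finds 4 codes, all correct, but the gold
# standard has 5, so the error is a miss: precision 1.0, recall 0.8.
p, r = precision_recall(
    ["J90", "I26.99", "S82.899A", "R91.8"],
    ["J90", "I26.99", "S82.899A", "R91.8", "J18.9"])
print(p, r)  # 1.0 0.8
```

When errors are mostly misses, as the speaker reports, recall lags precision, and improving the system means adding coverage rather than suppressing spurious codes.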