Contact Project Developer Ashish D. Tiwari [astiwz@gmail.com]
Download Synopsis Abstract
Java Data Mining BE-Engineering(CO/IT) BCS MCS BCA MCM BSC Computer/IT MSC Computer/IT IEEE-2015

Hadoop Recognition of Biomedical Named Entity Using Conditional Random Fields

As one of the most recognized models, the conditional random fields (CRF) model has been widely applied in biomedical named entity recognition (Bio-NER).
Abstract-Synopsis-Documentation

Hadoop Recognition of Biomedical Named Entity Using Conditional Random Fields

Abstract—Processing large volumes of data has presented a challenging issue, particularly in data-redundant systems. As one of the most recognized models, the conditional random fields (CRF) model has been widely applied in biomedical named entity recognition (Bio-NER). Due to the internally sequential feature, performance improvement of the CRF model is nontrivial, which requires new parallelized solutions. By combining and parallelizing the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) and Viterbi algorithms, we propose a parallel CRF algorithm called MapReduce CRF (MRCRF) in this paper, which contains two parallel sub-algorithms to handle two time-consuming steps of the CRF model. The MapReduce L-BFGS (MRLB) algorithm leverages the MapReduce framework to enhance the capability of estimating parameters. Furthermore, the MapReduce Viterbi (MRVtb) algorithm infers the most likely state sequence by extending the Viterbi algorithm with another MapReduce job. Experimental results show that the MRCRF algorithm outperforms other competing methods by exhibiting significant performance improvement in terms of time efficiency as well as preserving a guaranteed level of correctness.

Comment is Only Available for registered users! Create Account or Login Now!