Development of a Specialized Search Tool for Microbiology
Table of Contents
- Requirements Analysis and Discussion
This chapter is central to the goal of the research: the development of a specialized search tool for microbiology. The emphasis is on a thorough analysis of the requirements that best deliver the functional characteristics expected of a quality microbiology search engine (Chauhan et al. 2013).
The user requirements must be met by the search engine provider, who also has requirements of their own. This research investigates what both the users and the provider require. Once these requirements are understood in detail, a system can be developed that meets them as fully as possible.
This research describes the gathering and assembly of the user and provider requirements and the development of a search engine tool for microbiology. The study aims to assess the importance of search engine techniques applied as E-learning tools in biology and the advantages of a specific search engine technique. It is also important to understand how E-learning may be adapted to support microbiology researchers and students. The proposed tool is intended to draw on Google’s search engine capabilities to return specific microbiology results to users, since Google is a widely used search engine with an enormous database of content. Furthermore, the tool shall aim for efficiency by providing the most relevant results for microbiology researchers. In addition, it shall make use of machine learning techniques and data mining concepts to make document classification efficient.
The implications of the study for those concerned with implementing microbiology search engines in Saudi Arabia will be analyzed. Moreover, the study aims to investigate different ways of using E-learning to boost the academic achievement of students and researchers. Other aims include evaluating the relevance of such technology in both the United Kingdom and Saudi Arabia, as well as the acceptance of E-learning technology for microbiology among academics in the Kingdom of Saudi Arabia (KSA). This will be a quantitative study, supported by surveys to collect data from a selected population.
User requirements are a clearly stipulated set of activities or functional procedures that users are expected to carry out in their day-to-day interaction with a system. Among the most essential user requirements for microbiology researchers is the ability to perform a very specific search, using query expansion to narrow the breadth of retrieval responses. That breadth arises because microbiology terms are also used in other areas, leading to ambiguity and multiple meanings (for example, genetic algorithms), and because part words such as gene occur inside everyday words such as general. The final goal of specific, relevant retrieval is therefore pursued by applying concepts such as domain ontology and semantic search methods (Chauhan et al. 2013).
Domain ontology search is a fundamental functional requirement for users of the targeted microbiology search engine, because it facilitates efficient retrieval of the required information through automatic expansion of a query request (Chauhan et al. 2013). When a user enters a search query, the meaningful notions are extracted. These notions, together with the domain ontology, are then used as the basis for query expansion. Finally, a large set of documents is retrieved and ranked by relevance to the user query. This ranking of results is an essential need for the users of a specialized microbiology search tool. In this sense, this chapter is a crucial building block for the success of this research.
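The expansion and ranking steps described above can be illustrated with a minimal sketch. It is not the method of Chauhan et al. (2013): the toy `ONTOLOGY` dictionary, the sample documents and the simple term-count score are all assumptions chosen for illustration. Whole-word tokenization is used so that a query for gene does not match general.

```python
import re
from collections import Counter

# Hypothetical toy ontology: each concept maps to related terms.
ONTOLOGY = {
    "gene": ["genome", "dna"],
    "bacteria": ["bacterium", "prokaryote"],
}


def tokenize(text):
    """Split text into lower-case whole words ('general' stays one token)."""
    return re.findall(r"[a-z]+", text.lower())


def expand_query(query):
    """Return the query terms followed by their ontology neighbours."""
    terms = tokenize(query)
    expanded = list(terms)
    for term in terms:
        for related in ONTOLOGY.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


def rank(query, documents):
    """Score documents by occurrences of expanded terms, best first."""
    terms = expand_query(query)
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Illustrative documents, not taken from any real corpus.
DOCS = {
    "d1": "The bacterium genome encodes each gene as DNA.",
    "d2": "A general overview of campus facilities.",
    "d3": "Gene regulation in bacteria.",
}
```

Calling `rank("gene bacteria", DOCS)` places `d1` ahead of `d3` and omits `d2`, because the expanded terms match four words in `d1`, two in `d3` and none in `d2`. A production system would replace the term count with a proper relevance model.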
The successful implementation of the proposed search engine depends on a comprehensive analysis of the user and provider requirements. This means first conducting an analysis, by questionnaires and/or by observing how users interact with existing search engines, and then determining the requirements catalogue on the basis of the results. In the survey conducted, the users provided suggestions on the types of service they would like from the search engine. The quantitative data that can be converted into useful and achievable deliverables will therefore be used to create the architecture of the final search engine tool.
The Results of the Questionnaire Survey on Existing Search Engines
- The search engines are used by both males and females with proportions of 62.43% and 37.57% respectively (Q2).
- The existing search engines are used by people of varied age groups. The majority of users are aged 20-30, forming 61.38% of the sampled population. The 30-55 age group follows with 32.80%, then those under 20 with 5.29%. Those over 55 years of age are the smallest group at 0.53% (Q3).
- Looking at how frequently the sampled population used a computer for study shows that computers are a vital part of studying for the majority. For example, 82.54% of the students used computers daily for study, followed by 13.2% weekly and 3.70% monthly. Only 0.53% reported not using them at all (Q4).
- As for the most commonly used search engine, Google led with 98.94%, followed by Yahoo and Bing at 2.66% each and, in last position, Ask.com at 1.06% (Q5).
- From Q6, it is evident that search engines are very helpful in study, with students rating them at 4.33 out of 5.
- Students often use search engines in study, with 65.08% using them on a daily basis, 28.5% on a weekly basis and 2.38% on a monthly basis. A small fraction, 3.978%, never used them for study (Q7).
- 75% indicated that they do not use an E-learning system or a specialised search engine for microbiology. Only 18.25% used either of them (Q8).
- Regarding satisfaction with existing search engines, 21.43% reported being very satisfied, 42.86% satisfied, 32.54% neutral and 2.38% dissatisfied (Q9).
- In Q11, 81.75% indicated that the search engines are able to give results in more than one language, while 18.25% said they are not.
- In response to Q12, “Are all subfields of microbiology considered in the search engine?”, a slight majority of 51.59% said “No” and 48.41% said “Yes”.
- When the sampled users were asked whether the new search engine should be more user-friendly than current ones, 25.56% strongly agreed, 50% agreed and 24.44% were neutral. No one disagreed or strongly disagreed (Q13).
- There was a strong need to integrate the proposed search engine with other web resources so as to access more search engines. When the sampled users were asked about this, 36.67% strongly agreed, 48.89% agreed, 12.22% were neutral and only 2.22% disagreed (Q14).
- On whether the proposed search engine should allow for customized search on microbiology content, 26.67% strongly agreed, 43.33% agreed, 28.98% were neutral and 1.11% strongly disagreed (Q15).
- In respect to the inclusion of a multilingual feature into the proposed search tool, 53.33% strongly agreed, 33.33% agreed, 8.89% were neutral and 4.44% disagreed (Q16).
- The users were also asked whether ‘search-as-you-type’ facility should be included in the proposed search tool (Q17). From the results, 44.44% strongly agreed, 40% agreed, 14.44% were neutral and 0% disagreed.
- In Q18, the author wanted to know whether it was important to provide access to other common search engines such as Google. The statistics indicated 44.4% strongly agreed, 38.89% agreed, 13.33% were neutral, 2.22% disagreed and 1.11% strongly disagreed.
- In Q19, the author sought the users’ opinion on whether it was necessary to include a user instruction kit on using the tool and searching through the different subfields. The majority found it necessary: 40% strongly agreed and 36.67% agreed, whereas 23.33% were neutral. No one disagreed.
- With regard to currency of microbiology content and research, the majority said that the proposed search tool should include both the past and the current resources. Metrics indicated that 41.11% strongly agreed, 36.67% agreed, 22.22% were neutral and no one disagreed (Q20).
- One of the important questions in the survey was whether there was need to include microbiology content as sub disciplines and other forms of classifications (Q21). The results indicated that there was need to do so. 34.44% strongly agreed, 43.33% agreed, 21.11% were neutral and only 1.11% disagreed.
- In Q22, the sampled users were asked whether it was necessary to include links to other microbiology learning resources. The statistics implied that it was: 37.65% strongly agreed, 40% agreed, 21.118% were neutral and only 1.18% disagreed.
- In Q23, the author found it crucial to determine the significance of a microbiology search engine in enhancing lectures, tutorials and practical studies. Indeed, the microbiology search engine was deemed significant for supporting learning activities. 41.18% strongly agreed, 37.65% agreed, 20% were neutral and only 1.18% disagreed.
- Do specialised search engines bring students with similar subject interests together through information exchange? (Q24). From the results, the answer appears to be “Yes”: 38.82% strongly agreed, 43.53% agreed, 12.94% were neutral, 3.53% disagreed and 1.18% strongly disagreed.
- Overall, the sampled microbiology students indicated that web search engines eliminate the constraint of limited learning resources in microbiology (Q25). 18.2% strongly agreed, 41.18% agreed, 32.94% were neutral, 5.88% disagreed and 1.18% strongly disagreed.
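As a small worked illustration of how the Likert-scale results above can be summarised, the sketch below adds the "strongly agree" and "agree" shares for four of the questions and checks that each row sums to roughly 100%. The percentages are copied from the list above; treating any response category that the text does not report as 0% is an assumption, as are the short labels and function names.

```python
# Percentages copied from the survey results (Q13, Q14, Q16, Q18).
# Order: strongly agree, agree, neutral, disagree, strongly disagree.
# Assumption: categories not reported in the text are taken as 0.0.
RESPONSES = {
    "Q13": [25.56, 50.00, 24.44, 0.00, 0.00],  # more user-friendly
    "Q14": [36.67, 48.89, 12.22, 2.22, 0.00],  # integration with web resources
    "Q16": [53.33, 33.33, 8.89, 4.44, 0.00],   # multilingual feature
    "Q18": [44.40, 38.89, 13.33, 2.22, 1.11],  # access to common search engines
}


def agreement(row):
    """Combined share of 'strongly agree' and 'agree', in percent."""
    return round(row[0] + row[1], 2)


def sums_to_hundred(row, tolerance=0.1):
    """True when the row total is within `tolerance` of 100 percent."""
    return abs(sum(row) - 100.0) <= tolerance


def rank_by_agreement(responses):
    """Question labels ordered from highest to lowest combined agreement."""
    return sorted(responses, key=lambda q: agreement(responses[q]), reverse=True)


if __name__ == "__main__":
    for question in rank_by_agreement(RESPONSES):
        row = RESPONSES[question]
        print(question, agreement(row), sums_to_hundred(row))
```

On these figures the multilingual feature (Q16) has the highest combined agreement at 86.66%, followed by integration with other web resources (Q14) at 85.56%.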
Resolutions from the Survey
A critical review of the survey results clarified the issues that need to be considered in the development of the proposed Specialised Search Engine for Microbiology (SSEM). It was clear that users expect a tool that supports:
- A multilingual feature that allows searching in more than one language
- High user-friendliness
- Topical and other forms of classification for organizing search results
- Customization of the search
- A search-as-you-type feature that gives the user suggestions while a search is being entered
- Accurate and relevant results
- Simple but advanced search
- All fields in microbiology
Among the key functionalities that will achieve the targeted user requirements determined in earlier research are:
- Faster retrieval of information from a wide set of predefined online sources
- A granular topical selection
- Searching within content and metadata (title, subject, author and/or language)
- Inclusion of non-English documents in the corpus
- Expansion of queries to gather and filter required information or results
- Ranking of results based on relevance to subject matter
- A module for processing language-based search operations
- Categorization of results on the basis of key microbiology topics
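Two of the functionalities listed above, searching within content and metadata and categorizing results by topic, can be sketched as follows. This is a minimal illustration under stated assumptions: the `Document` fields mirror the metadata named in the list, while the sample corpus, the function names and the exact-match metadata filters are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    subject: str
    author: str
    language: str
    content: str


def search(docs, term, language=None, subject=None):
    """Return documents whose title or content contains `term` as a whole
    word, optionally filtered by the language and subject metadata."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    hits = []
    for doc in docs:
        if language is not None and doc.language != language:
            continue
        if subject is not None and doc.subject != subject:
            continue
        if pattern.search(doc.title) or pattern.search(doc.content):
            hits.append(doc)
    return hits


def group_by_subject(docs):
    """Categorize result titles under their subject metadata."""
    groups = {}
    for doc in docs:
        groups.setdefault(doc.subject, []).append(doc.title)
    return groups


# Hypothetical corpus; the "ar" entry stands in for a non-English document.
CORPUS = [
    Document("Biofilm formation", "bacteriology", "A. Author", "en",
             "Biofilm growth on surfaces."),
    Document("Virus entry", "virology", "B. Author", "en",
             "How a virus enters the host cell."),
    Document("Biofilm overview", "bacteriology", "C. Author", "ar",
             "Placeholder for non-English text about biofilm."),
]
```

For example, `search(CORPUS, "biofilm", language="en")` returns only the first document, and `group_by_subject(search(CORPUS, "biofilm"))` groups both biofilm titles under `bacteriology`.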
In order to achieve these functionalities, the proposed search engine will rely on effective approaches that include: the application of semantic technologies; existing free tools, as the tool is intended to be free to use; comprehensive database strategies; and cloud computing strategies.
Firstly, from the perspective of user interaction, semantic technologies will be explored to determine whether they are a practical approach to the organized extraction and exploitation of valuable information from technical documents. This approach applies robust information search processes in conjunction with machine learning platforms, which is one of the most valuable aspects in enabling the targeted search engine to extract the data needed in the day-to-day activities of a microbiology student or researcher (Garzoli et al., 2013).
The search engine for microbiology shall therefore be developed using simple tools and easily available resources. For data classification, the NLTK package shall be used. Furthermore, machine learning techniques shall be employed to increase the tool’s effectiveness.
Secondly, for any online computer system to be considered high quality, a well-designed database must provide a standard platform for the storage and retrieval of data. In this respect, two approaches need to be applied to the data storage requirements of the targeted search engine: (1) a user-driven approach and (2) a data-driven approach, which will work together as one coherent component. For the user-driven approach, query instructions have to be designed so that they search and retrieve data from the database with emphasis on the level of education of a microbiology student. The data-driven approach, on the other hand, will ensure that sets of closely related materials on the searched subject are presented to the user (Abai et al., 2013).
Thirdly, integrated connectivity to the cloud network will be considered. This is a very important requirement, as it ensures access to a wide set of global data. The cloud network plays a major role in promoting faster and more relevant data access. It also eases the workload on a local database by creating a virtual data access centre where resources can be stored and retrieved when needed (Schadt et al., 2011).
Semi-structured interviews will be held with specialists in the various fields of microbiology, including applied and pure microbiology, to help determine the organization of microbiology content. In the pre-implementation interviews, the specialists will be asked about the organization and relevance of search results from the existing search engines in their work. There will be several post-implementation interviews to gain insight into their experience of using different search tools for microbiology and the techniques they use to obtain relevant results. The pre-implementation interview will last between 40 and 60 minutes, and the post-implementation interviews will be of similar length. The interviews will be recorded as notes and analysed.
Semi-structured interviews will also be held with search engine IT and application specialists to assess the technical aspects and range of specific search engines, learning algorithms, and the retrieval, processing and presentation techniques used in web tools, and how they perform in practice. In the pre-implementation interviews, these specialists will be asked about their experiences in determining user and provider requirements.
After the user requirements have been determined, the software specialists will help to review the development of the tool and its components as part of the tool development process, drawing on their experience with software, firmware, hardware and web retrieval algorithms.
The research will also identify a number of areas and specific microbiology topics with which to evaluate how effective the search tool is on some current, locally important microbiology searches of the existing literature. The effect of these tests on the requirements and design of the search tool will be noted. A range of potential searches considered relevant to microbiology in Saudi Arabia will be considered, to ensure the tool is not shaped by only a few tests.
Saudi Arabia has a pressing need to adopt advanced microbiology learning activities, and it faces a number of challenges in microbiology. A better search engine that encourages students is a key component in promoting interest in microbiology and in the success of this proposed project. Promoting the adoption of advanced microbiology learning activities in Saudi Arabia is one of the key aims of this research project.
The socio-economic, cultural and educational practices in Saudi Arabia have been considered a subject of concern for the nation’s progress. The implementation of the planned tool will be fundamental in promoting the wider and deeper study of microbiology.
This project places major emphasis on some areas of Saudi life where there is a need for significant microbiological consideration.
One significant issue of concern in the Saudi environment is the hygiene of the water used by many citizens. This is particularly so in areas such as Al-Baha Province, where people consume contaminated water (Omer et al., 2014). This water is shared with camels, yet no strategies are applied to prevent the microbiological infections that come with the use of unsafe drinking water. The quality of water in the region requires significant microbiological follow-up in order to protect people from a wide range of deadly water-borne diseases (Omer et al., 2014).
Another major concern in Saudi Arabia is the effect of microbial infections such as the coronavirus infection that affects dromedary camels and spreads to humans. This virus poses a significant threat to the health of camels and livestock. In fact, if not treated with the right antibodies, the virus can easily result in large numbers of deaths among livestock.
Last but not least, there is the subject of respiratory tract infection, which has been widely experienced during the periods of Hajj. Hajj is an Islamic religious practice in which almost three million Muslims from all around the globe gather in Saudi Arabia for pilgrimage. In some cases, this has resulted in the spread of a number of diseases, including respiratory infections. The most common respiratory tract infections recorded include influenza and rhinovirus. To reduce the spread of these infectious diseases, the use of masks, hand hygiene and vaccination has been encouraged.
In general, all of this information has to be well integrated into the proposed microbiology search engine. It will play a crucial role in enhancing knowledge of microbiology in the United Kingdom, Saudi Arabia and the world at large.
The ever-increasing and complex data needs of biological subjects have in recent years become a crucial concern. To promote fast and efficient data access, technology professionals have been working to develop technologies that give better access to scientific materials. In this sense, a significant set of microbiology requirements has proved key to ensuring that the targeted search engine is a successful project. In simple terms, the microbiology requirements have been divided into two major groups, which are set out below.
Microbiology search engine requirements
- A set of semantic functional procedures that will enhance online data search through knowledge-based and intelligent discovery of web content. The semantics need to be incorporated both in the data and in the search engine. This will be very important in ensuring that the targeted microbiology search engine is achieved.
- A module to integrate searched information into the predefined sets of semantic functionalities. In essence, the aim is to provide computational solutions for the large data volumes that have to undergo timely management and analysis before being presented to the user. This is to aid faster retrieval of the requested information (Schadt et al., 2011).
Microbiology Student/Researcher Requirements
- A search engine that can speed up their research by offering relevant results.
- A search engine that can process text and group results into relevant subcategories of microbiology.
- A search engine that can process language-specific queries while returning the most relevant results in a desired format and category.
It is important to outline the pressing needs that this application will cater for.
Firstly, it will play a role in promoting granular topical selection, which has been a key challenge for microbiology students and researchers. Most current search engines provide broad and irrelevant search results, which makes it difficult for microbiology students to gain direct access to useful resources. This application will therefore cater for the need for search results that accurately relate to the subject of concern (Gupta & Singh, 2013).
In addition, the need for the search process to be conducted within both the metadata and the content can be catered for by the proposed application. Through the adoption of semantic technologies, along with a direct connection to the cloud network, users will enjoy better access to the required information.
Lastly, there is the need for an optimized search engine that delivers the most relevant results within a reliable or reduced time frame while using less processing power, which is another key aspect that this application will cater for (Gupta & Singh, 2013).
Evaluation will involve real users testing the functionality and capability of the search engine tool. The search tool will be configured on a server; for testing purposes, free hosting services will be used. One hundred students studying biology-related courses will be given the URL of the search engine tool to research a microbiology subject. Instructors will evaluate the students’ response times during the research.
On the search engine tool page, a survey button will be available for the students to give feedback on how they found the search engine tool. They will be asked whether it was helpful and how easy and efficient it was to find information. The results will be stored in a database within the search engine tool for analysis.
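The feedback store described above could look roughly like the sketch below, which uses Python's built-in `sqlite3` module. The table name, column names, the 1 to 5 ease scale and the helper functions are assumptions made for illustration; the chapter does not specify a schema or a database engine.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the feedback database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feedback ("
        "id INTEGER PRIMARY KEY, "
        "helpful INTEGER NOT NULL CHECK (helpful IN (0, 1)), "
        "ease INTEGER NOT NULL CHECK (ease BETWEEN 1 AND 5), "
        "comment TEXT)"
    )
    return conn


def record(conn, helpful, ease, comment=""):
    """Store one survey response; parameters are bound, not interpolated."""
    conn.execute(
        "INSERT INTO feedback (helpful, ease, comment) VALUES (?, ?, ?)",
        (int(helpful), ease, comment),
    )
    conn.commit()


def summary(conn):
    """Return the response count, share found helpful, and mean ease score."""
    count, helpful_share, mean_ease = conn.execute(
        "SELECT COUNT(*), AVG(helpful), AVG(ease) FROM feedback"
    ).fetchone()
    return {
        "responses": count,
        "helpful_share": helpful_share,
        "mean_ease": mean_ease,
    }
```

A response is stored with `record(conn, True, 5, "easy to use")`, and `summary(conn)` then provides the figures for analysis. The `CHECK` constraints reject out-of-range ratings at the database level, so malformed submissions do not reach the analysis step.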
This chapter has focused on a thorough analysis of the user requirements for the day-to-day operation of the proposed microbiology search tool. It began by outlining that the success of the proposed system depends mainly on its ability to provide the functional features expected by its users. These users include students, practitioners and other professionals in the field of microbiology.
To gain a clear understanding of the user requirements, a number of sections were considered essential to arriving at an appropriate analysis. These include a section on the existing search engines, an approach to user requirements analysis, the Saudi requirements analysis, the microbiology requirements analysis, and the most pressing needs of the application's users. Consequently, it has been found that the targeted users require the specialized microbiology search engine to offer services such as faster retrieval of information and ranking of results on the basis of relevance to the subject matter.
Abai, N, Yahaya, J & Deraman, A. (2013). User Requirements Analysis in Data Warehouse Design: A review. Retrieved from http://www.sciencedirect.com/science/article/pii/S2212017313004155
Allemang, D. and Hendler, J. (2011) Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL. 2nd ed. New York: Elsevier.
Arrigo, M., Gentile, M., Taibi, D., & Di Giuseppe, O. 2005. Specialized Search Engines for E-learning. Italian National Research Council- Institute for Educational Technology. Palermo, 8, 223
Bu, M.A.O., Lin, T.I.A.N., & Wen, X.I.E. 2010. Empirical Research on Student of Internet Based on Clustering [J]. Journal of Sichuan University of Science & Engineering (Natural Science Edition), 6, 020
Chauhan, R., Goudar, R., Sharma, R., & Chauhan, A. 2013. Domain ontology based semantic search for efficient information retrieval through automatic query expansion. Department of Computer Science & Engineering, Dehradun.
Dabrock, P., Taupitz, J. & Ried, J., 2012. Trust in Biobanking: Dealing With Ethical, Legal and Social Issues in an Emerging Field of Biotechnology. London: Springer.
Dalkilic, M., Kim, S. & Yang, J., 2006. Data Mining and Bioinformatics: First International Workshop, VDMB 2006, Seoul, Korea, September 11, 2006, Revised Selected Papers. illustrated ed. London: Springer.
Duch, W., 2005. Artificial Neural Networks: Biological Inspirations. ICANN 2005: 15th International Conference, Warsaw, Poland, September 11-15, 2005, Proceedings. illustrated ed. Basel, Switzerland: Birkhäuser.
Gama, J., 2005. Machine Learning: ECML 2005: 16th European Conference on Machine Learning, Porto, Portugal, October 3-7, 2005: Proceedings. illustrated ed. London: Springer.
Garzoli, F, Croce, D, Nardini, M, Ciambra, F & Basili, R. (2013). Robust Requirements Analysis in Complex Systems through machine Learning. Retrieved from http://link.springer.com/chapter/10.1007/978-3-642-45260-4_4
Gupta, S & Singh, R. (2013). Search Engine Optimization – Using Data mining Approach. Retrieved from
HeracleBioSoft, 2012. Using Google for Biology Searches.
Jaffar, A, Alimuddin, Z & Ziad, M. (2014). Respiratory tract infections during the annual Hajj: potential risks and mitigation strategies. Retrieved from http://journals.lww.com/co-pulmonarymedicine/Abstract/2013/05000/Respiratory_tract_infections_during_the_annual.3.aspx
Karaman, F., 2012. Artificial Intelligence Enabled. [Online]
Kell, D.B., Darby, R.M., & Draper, J. 2001. Genomic computing. Explanatory analysis of plant expression profiling data using machine learning. Plant Physiology, 126, (3) 943-951
Levene, M., 2010. An Introduction to Search Engines and Web Navigation. 2 ed. Hoboken, New Jersey: John Wiley & Sons.
Life in Research, 2008. Life Sciences Search Engine. [Online] Available at: <http://vadlo.com/About_Vadlo.html>.
Lindquist, A.M., Johansson, P.E., Petersson, G.r.I., Saveman, B.I., & Nilsson, G.C. 2008. The use of the Personal Digital Assistant (PDA) among personnel and students in health care: a review. Journal of medical Internet research, 10, (4)
Lytras, MD 2013, Information systems, e-learning, and knowledge management research 4th World Summit on the Knowledge Society, WSKS 2011, Mykonos, Greece, September 21-23, 2011. Revised selected papers. Berlin: Springer.
Omer, E, Algamidi, A, Algamidi, I, Fadlelmula, A & Alsubaie S. (2014). The hygienic-related microbiological quality of drinking water sources Al-Baha Province, Kingdom of Saudi Arabia. Retrieved from http://www.thejhs.org/article.asp?issn=1658-600X;year=2014;volume=2;issue=2;spage=68;epage=74;aulast=Omer
Payne, K.F.B., Wharrad, H., & Watts, K. 2012. Smartphone and medical related App use among medical students and junior doctors in the United Kingdom (UK): a regional survey. BMC medical informatics and decision making, 12, (1) 121
Risvik, K.M. & Michelsen, R. 2002. Search engines and web dynamics. Computer Networks, 39, (3) 289-302
Rosenberg, A. & Arp, R., 2009. Philosophy of Biology: An Anthology. illustrated ed. New Jersey: John Wiley & Sons.
Schadt, E, Linderman, M, Sorenson, J, Lee, L & Nolan, G. (2011). Cloud and heterogeneous computing solutions exist today for the emerging data problems in biology: Nature reviews. Retrieved from http://www.nature.com/nrg/journal/v12/n3/full/nrg2857-c2.html
Taibi, D., Gentile, M., & Seta, L. A semantic search engine for learning resources, Citeseer.
Trigg, R. H., Blomberg, J., & Suchman, L. Moving document collections online: The evolution of a shared repository, Springer, pp. 331-350.
Walia, R.R. 2008. Collaborative Filtering: A Comparison of Graph-based Semisupervised Learning Methods and Memory-based Methods.
Wang, P., Morgan, A.A., Zhang, Q., Sette, A., & Peters, B. 2007. Automating document classification for the Immune Epitope Database. BMC bioinformatics, 8, (1) 269
Wolf, J. L., Squillante, M. S., Yu, P. S., Sethuraman, J., & Ozsen, L. Optimal crawling strategies for web search engines, ACM, pp. 136-147.