Using R for Scientometrics Data Visualization (Second Edition)
This book gives a detailed introduction to BIBLIOMETRIX, an R package developed by Massimo Aria and Corrado Cuccurullo of the University of Naples Federico II, Italy. The package covers most of the functionality needed for scientometric analysis and knowledge visualization, and it should suit readers who like R and want to use it for scientometrics and knowledge mapping. The book also briefly introduces several other R packages relevant to scientometrics and knowledge mapping, including rAltmetric, wordcloud2, gender, and tidytext.
Preface
We first heard about bibliometrics 10 years ago. In 2008 Corrado was writing a monograph on fast-growing firms, a niche theme that he was approaching for the first time. The scientific literature was fairly limited. Scholars came from different disciplines, with a variety of approaches and methods that made it difficult to cumulate the findings. We talked about this research problem once during a football match among scholars. Our discussion of the various techniques for systematic literature analysis continued for several days. We enjoyed the exchange and concluded that bibliometrics was an interesting method and that it would be fun to explore it together.
Our goal became to examine the intellectual structure of research on fast-growing firms. We analyzed all the scientific production published in English-language academic journals. The analysis was complex because it required several steps and a range of analysis and mapping software tools, which were often available only under commercial licenses. The whole process was unwieldy, from data collection to data visualization. Massimo contributed greatly with his statistical and coding skills. Our collaboration continued in moments of fun, such as our frequent football matches. While analyzing data, we discovered that we enjoyed working together. In short, our friendship soon turned into a scientific collaboration that continues to this day. Within our departments and academic communities, the reaction to our work was positive. At that time, few people talked about bibliometrics in Italy, even from the point of view of research evaluation.
Years later we presented a bibliometric analysis paper on performance management at the Annual Conference of the Academy of Management, the largest international management meeting. On that occasion, too, we received positive feedback that pushed us to persist. In the same years, young Italian colleagues asked us for suggestions for their literature reviews and for their research. Massimo opened some statistical analysis laboratories in R, and together we presented bibliometric analysis at some workshops. We are telling this story because without this feedback and these stimuli we would not have published bibliometrix release 0.1 in 2016. A year later we are at version 1.7, thanks to our growing passion for bibliometrics and to the suggestions that now come from scholars all around the world. R-bibliometrix is currently a free tool for quantitative research in scientometrics and bibliometrics that includes all the main bibliometric methods of analysis and is easy to use even for those who have no coding skills. Bibliometrix is a unique tool, developed in R, the language for statistical computing and graphics, according to a logical bibliometric workflow. R is highly extensible because it is an object-oriented and functional programming language, so it is fairly easy to automate analyses and create new functions. Because R is open-source software, it is also easy to get help from the user community, which is composed mainly of prominent statisticians. As a result, bibliometrix is flexible, can be upgraded rapidly, and can be integrated with other statistical R packages. That is why it is useful in a constantly changing science such as bibliometrics. Today bibliometrix is more than just a statistical tool. It is becoming a community of international developers and users who exchange questions, impressions, opinions, and examples within an open-source project.
For this reason, we are very honored that Dr. Jie Li of the Research Center for Safety and Security SCITECH Trends at the Department of Safety Science and Engineering, Shanghai Maritime University, gave us the opportunity to tell you this story and to write an English preface for his book “Using R for Scientometrics Data Visualization”, which mainly introduces the BIBLIOMETRIX package to scholars and students. We said that bibliometrix includes all the main bibliometric methods of analysis, but we use it especially for science mapping and not for measuring science, scientists, or scientific productivity. Synthesizing past research findings is one of the most important tasks in advancing a line of research. Various methods exist to summarize the amount of scientific activity in a domain, but bibliometrics has the potential to introduce a systematic, transparent, and reproducible review process. This is very relevant in an age when the number of academic publications is rising at a very fast pace and it is increasingly unfeasible to keep track of everything that is being published, and when the emphasis on empirical contributions is resulting in voluminous and fragmented research streams and a contested field. Literature reviews increasingly play a crucial role in synthesizing past research findings to make effective use of the existing knowledge base, advance a line of research, and give evidence-based insights into the practice of exercising and sustaining professional judgment and expertise. The overwhelming volume of new information, conceptual developments, and data is the milieu in which bibliometrics becomes useful: it provides a structured analysis of a large body of information to infer trends over time and themes researched, identify shifts in the boundaries of disciplines, detect the most prolific scholars and institutions, and show the “big picture” of extant research.
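Naples, Italy, July 2017
Massimo Aria and Corrado Cuccurullo
Foreword
We are now in the era of big data in scientific literature. Faced with a massive body of publications, how can we quickly grasp the overall landscape and future trends of a research field, direction, or topic? Against this background, the timely development of scientometric theories, methods, and techniques that bear directly on this question has become an effective way to answer it. Mastering scientometric techniques and methods has also become a basic skill requirement for researchers in this new era. Over the past decade or more, the theories and methods of scientometric data visualization have spread widely into the research practice of other disciplines. In China, the research area that takes scientific text data as its object and uses visualization techniques to reveal the structure, evolution, and interaction of disciplines is collectively known as “scientific knowledge mapping”.
Scientometric data visualization rests on a large body of basic theory from scientometrics (including bibliometrics, webometrics, and informetrics), such as the distribution of author productivity, co-citation, bibliographic coupling, topic co-occurrence, and author collaboration. It also draws on techniques and methods from statistics and network science, such as multidimensional scaling, cluster analysis, complex network analysis, natural language processing, and text mining. These theories and methods form the knowledge base for scientometric data visualization and are a prerequisite for knowledge mapping analysis. Building on them, scholars in China and abroad have developed dozens of software programs and packages for mining scientific and technical text; well-known examples include HistCite, BibExcel, CiteSpace, SCI2, and VOSviewer. These tools make it possible for scholars to analyze the literature of a field in order to understand its research landscape and dynamics.
Over the past five years of practical work in scientometrics and knowledge mapping, I have written books on CiteSpace, VOSviewer, BibExcel, and related tools, mainly to help scholars outside scientometrics apply these methods quickly in support of their research. Since 2016 I have organized four events on scientometrics and knowledge mapping and have exchanged ideas with several hundred knowledge mapping enthusiasts from across China. The question that came up most often in these exchanges, and that most prompted me to reflect, was: “How should I interpret the maps I have produced?” In my view, scientometric and knowledge mapping methods merely offer us a new way of understanding the world of knowledge; making use of that understanding still requires practitioners to think, drawing on their own disciplinary background as well as on the theory and methods of knowledge mapping. When carrying out a scientometric or knowledge mapping analysis, readers must be clear about what problem they want to solve, why knowledge mapping can solve it, what advantages it has over other methods, and so on. In other words, first settle on the research question, and only then choose the form of knowledge map that will address it.
This book is a companion to “CiteSpace: Scientific Text Mining and Visualization”, “Scientometrics and Knowledge Network Analysis: Practice Based on BibExcel and Other Software”, and “Principles and Applications of Scientific Knowledge Mapping: A Beginner's Guide to VOSviewer and CitNetExplorer”. Unlike those stand-alone applications, this book gives a detailed introduction to the BIBLIOMETRIX package, developed in R by Massimo Aria and Corrado Cuccurullo of the Department of Economics and Statistics, University of Naples Federico II, Italy. Readers are advised to use the link provided to check whether they have the latest version of BIBLIOMETRIX, and to use the latest version wherever possible when analyzing data in actual research (BIBLIOMETRIX-R Package for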
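Bibliometric and Co-Citation Analysis, http://www.bibliometrix.org/). The package covers most of the functionality needed for scientometric analysis and knowledge visualization (Figure 0.1), and it should suit readers who like R and want to use it for scientometrics and knowledge mapping. In addition, the book introduces several other R packages relevant to scientometrics and knowledge mapping, such as rAltmetric, wordcloud2, gender, and tidytext. It says little about full-text mining of English documents in R and does not yet cover full-text mining of Chinese documents. Future updates will expand the treatment of full-text mining in R as appropriate.
Figure 0.1 Overview of bibliometrix functions
To help readers become familiar with the bibliometrix package, most of the examples in this book run on the data sets bundled with the package; for some examples, Web of Science and Scopus data sets were downloaded and analyzed specifically. The examples present the results of each analysis but do not describe them or interpret them with a particular research purpose in mind. By studying these results, readers can think for themselves about what they might do with the methods, or at the very least use them to get a basic picture of the field they care about.
This book uses the following conventions:
Text after > is code.
# marks an explanation of the code.
## marks the output of running the code.
I thank Massimo Aria and Corrado Cuccurullo, who gave generous help while this book was being written and contributed its English preface. I thank Yang Ling, director of Capital University of Economics and Business Press, for her strong support in publishing the scientometrics and knowledge mapping book series; Dr. Li Binbin of the Chinese Academy of Sciences for help with extracting submatrices; Yu Miao, postdoctoral fellow at the University of Waterloo, for suggested revisions to the manuscript; and the book's editor, Xue Xiaohong, and graduate student Li Ping for their editing and careful proofreading.
Looking back on my experience in scientometrics and knowledge mapping research and practice, I have mixed feelings. I sincerely hope that this book and the related series will further promote the development and spread of practical scientometrics and knowledge mapping research in China, and that every reader will benefit from it.
Jie Li
Beijing, May 2018
Jie Li, PhD with postdoctoral training, was born in Shaanxi in 1987 and is currently an associate researcher at the National Science Library, Chinese Academy of Sciences, with research interests in scientometrics and safety science. He serves as co-editor-in-chief of the Journal of Integrated Security and Safety Science, deputy director of the youth editorial board of the Journal of Safety and Environment, an editorial board member of Safety Science and other journals, and a member of the National Committee on Scientometrics and Informetrics. He has published more than 60 academic papers and 6 books, including “CiteSpace: Scientific Text Mining and Visualization”, “Principles and Applications of Scientific Knowledge Mapping”, “Scientometrics and Knowledge Network Analysis”, and “Using R for Scientometrics Data Visualization”.
Contents
Lecture 1 R Basics
1.1 Downloading R
1.2 Installing R
1.3 Installing RStudio
1.4 Installing packages
1.5 Loading packages
1.6 Package help
1.7 Citing packages
1.8 Loading package data
1.9 Loading user data
1.10 Programming errors
Lecture 2 Scientometric Data Collection
2.1 WoS data
2.2 Scopus data
2.3 PubMed data
Lecture 3 Fundamentals of Scientometric Analysis in R
3.1 Data conversion in R
3.2 Meaning of data column names
3.3 Merging data sets
3.4 Removing duplicates
3.5 Slicing data
3.6 Editing data
3.7 Descriptive analysis
3.8 Statistical visualization
3.9 Citation information analysis
3.10 Altmetric information
3.11 Author ranking analysis
3.12 Author gender identification
3.13 h-type indices
3.14 Lotka analysis
3.15 Temporal distribution of knowledge units
3.16 Computing LCS for documents and authors
3.17 Citation count normalization
3.18 Term extraction
Lecture 4 Scientific Data Visualization in R
4.1 Knowledge unit membership matrix
4.2 Knowledge unit co-occurrence matrix
4.3 Submatrices of the membership matrix
4.4 Submatrices of the co-occurrence matrix
4.5 Co-occurrence matrix normalization
4.6 Network visualization
4.7 Visualization with VOSviewer
4.8 Collaboration network visualization
4.9 Coupling network visualization
4.10 Co-citation network visualization
4.11 Historical citation network analysis
4.12 Co-word network visualization
4.13 Conceptual structure map of terms
4.14 Semantic map analysis
4.15 Thematic evolution visualization
4.16 Word cloud visualization
4.17 PubMed data visualization
4.18 Full-text mining and visualization
4.19 Dynamics of highly productive authors
4.20 Strategic diagram of coupling networks
4.21 Temporal visualization of references
4.22 Split network map
Lecture 5 Web-Based R: biblioshiny
5.1 Data import and format conversion (Data)
5.2 Data filtering (Filter)
5.3 Main data set information (Dataset)
5.4 Source information (Sources)
5.5 Author information (Authors)
5.6 Document information (Documents)
5.7 Clustering
5.8 Conceptual Structure
5.9 Cognitive structure (Intellectual Structure)
5.10 Social Structure
Lecture 6 Hands-On Exercises
6.1 Bibliometrics of a specific author's papers
6.2 Scientometrics of a specific paper
6.3 Bibliometrics of a specific institution's papers
6.4 Comparative bibliometrics of specific journals
6.5 Bibliometrics of papers from a specific conference
6.6 Bibliometrics of literature on a specific topic
6.7 Bibliometrics of literature on a specific method
References
Appendices
Appendix 1 Core R code for scientometrics
Appendix 2 Meaning of core Web of Science fields
Appendix 3 Commonly used scientometric data visualization tools
Appendix 4 R packages for scientometric data visualization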