October 15, 2020

Creating Good Data

Create good data from the start, rather than fixing it after it is collected. By following the guidelines in Creating Good Data, you will be able to conduct more effective analyses and produce timely presentations of research data.

Data analysts are often handed poorly designed datasets to explore and study, which makes interpretation difficult and delays meaningful results. Much data analytics training focuses on how to clean and transform datasets before serious analysis can even start. Inappropriate or confusing representations, poor choices of measurement units, coding errors, missing values, outliers, and similar problems can be avoided by designing datasets well and by understanding how data types determine the kinds of analyses that can be performed.
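As a rough illustration of how a data type constrains the analysis, here is a minimal Python sketch. It is not taken from the book: the level names, the `summarize` helper, and the sample columns are all hypothetical, and `None` is assumed to mark a missing value.

```python
from statistics import mean, median_low, mode

# Hypothetical measurement levels; the names are illustrative only.
NOMINAL, ORDINAL, NUMERIC = "nominal", "ordinal", "numeric"


def summarize(values, level, order=None):
    """Return a (statistic name, value) pair suited to the measurement level."""
    present = [v for v in values if v is not None]  # None marks a missing value
    if not present:
        return ("none", None)
    if level == NOMINAL:
        # Unordered categories: only counting makes sense, so report the mode.
        return ("mode", mode(present))
    if level == ORDINAL:
        # Ordered categories: rank by the declared order, report the (low) median.
        ranks = [order.index(v) for v in present]
        return ("median", order[median_low(ranks)])
    if level == NUMERIC:
        # Numeric measurements: arithmetic is meaningful, so report the mean.
        return ("mean", mean(present))
    raise ValueError(f"unknown level: {level}")


# Hypothetical columns from a small survey dataset.
blood_type = ["A", "O", "O", None, "B"]
satisfaction = ["low", "high", "medium", "medium", None]
weight_kg = [70.0, 80.0, None, 60.0]

print(summarize(blood_type, NOMINAL))
print(summarize(satisfaction, ORDINAL, order=["low", "medium", "high"]))
print(summarize(weight_kg, NUMERIC))
```

Declaring the level alongside each column when the dataset is designed means the choice of statistic follows from the design, instead of being guessed later from whatever values happen to appear.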

Creating Good Data discusses the principles and best practices of dataset creation, and covers the basic data types along with the statistics and visualizations appropriate to each. A key focus of the book is why certain data types are chosen to represent concepts and measurements, in contrast to the typical discussions of how to analyze a specific data type once it has been selected.

Creating Good Data: What You Will Learn

  • Be aware of the principles of creating and collecting data
  • Know the basic data types and representations
  • Select data types, anticipating analysis goals
  • Understand dataset structures and practices for analyzing and sharing
  • Be guided by examples and use cases (good and bad)
  • Use cleaning tools and methods to create good data

Creating Good Data: Who This Book Is For

Researchers who design studies, collect data, and then conduct and report their analyses can use the best practices in Creating Good Data to produce better descriptions and interpretations of their work. In addition, data analysts who explore and explain other researchers' data will be able to create better datasets.
Harry J Foxwell

About the Author

Harry J. Foxwell is currently Associate Professor at George Mason University's Department of Information Sciences and Technology where he teaches graduate courses in Data Analytics. He earned his doctorate in Information Technology in 2003 from George Mason University's Volgenau School of Engineering (Fairfax, VA), and has also taught graduate courses there in operating systems, computer architecture and security, and electronic commerce.

He previously worked as Principal Engineer for Oracle's Public Sector division in the Washington DC area, where he was responsible for solutions consulting and customer education on Cloud Computing, Big Data, Solaris and Linux operating systems, and Virtualization technologies.

Harry worked for Oracle and Sun Microsystems (acquired by Oracle in 2010) from 1995 through 2017. Prior to that he worked as a UNIX and Internet specialist for Digital Equipment Corporation; he has worked with UNIX systems since 1979 and with Linux since 1995.

He is co-author of the books Pro OpenSolaris (Apress, April 2009) and Oracle Solaris 11 System Administration The Complete Reference (Oracle Press, September 2012), as well as Sun BluePrints: Slicing and Dicing Servers: A Guide to Virtualization and Containment Technologies (Sun BluePrints Online, October 2005); The Sun BluePrints Guide to Solaris Containers: Virtualization in the Solaris Operating System (Sun BluePrints Online, October 2006); and READ_ME_FIRST: What Do I Do With All of Those SPARC Threads? (Oracle Technical White Paper, August 2013).

Harry is a Vietnam veteran; he served as a Platoon Sergeant in the US Army's 1st Infantry Division in 1968-1969 and was awarded the Air Medal and the Bronze Star.


He is also an amateur astronomer and contributing member of the Northern Virginia Astronomy Club, a USATT member and competitive table tennis player, and a USSF (Soccer) referee.


He lives in Fairfax, Virginia with his wife Eileen and two bothersome cats. 

Lee Baker is an award-winning software creator who lives behind a keyboard in a darkened room. Illuminated only by the light from his monitor, he aspires to find the light switch. With decades of experience in science, statistics and artificial intelligence, he has a passion for telling stories with data, yet despite explaining it a dozen times, his mother still doesn’t understand what he does for a living. Insisting that data analysis is much simpler than we think it is, he creates friendly, easy-to-understand video courses that teach the fundamentals of data analysis and statistics. As the CEO of Chi-Squared Innovations, one day he’d like to retire to do something simpler, like crocodile wrestling.
