
Register for the NCI Genomic Data Commons Analysis Tool Challenge

Don’t miss the chance to mix collaboration, competition, and tool development in the NCI Genomic Data Commons (GDC) Analysis Tool Challenge.

By participating in the challenge, you will:

  • leverage data types and formats from the GDC to develop your tool.
  • help provide the cancer research community with new data analysis tools within the GDC.
  • use the GDC Analysis Tool Software Development Kit to integrate an analysis tool with the GDC data portal using data from the GDC.

The GDC Data Portal Analysis Center will feature the winning tools and make them available to a broad audience of cancer researchers.
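If you are exploring what a GDC-integrated tool might consume, a quick way to get a feel for the data is the public GDC REST API (this is separate from the Analysis Tool SDK itself). The Python sketch below lists a few projects; the endpoint and field names follow the public API documentation, and the exact fields you need will depend on your tool.

```python
# A minimal sketch (not the GDC Analysis Tool SDK) showing how the public
# GDC REST API can be queried while prototyping an analysis tool.
import requests

GDC_PROJECTS_ENDPOINT = "https://api.gdc.cancer.gov/projects"

def list_gdc_projects(size=5):
    """Fetch a handful of GDC projects as JSON."""
    params = {
        "fields": "project_id,name,primary_site",
        "format": "JSON",
        "size": size,
    }
    response = requests.get(GDC_PROJECTS_ENDPOINT, params=params)
    response.raise_for_status()
    return response.json()["data"]["hits"]

if __name__ == "__main__":
    for project in list_gdc_projects():
        print(project["project_id"], "-", project.get("name"))
```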


Best AI Tools for Market Research

Ever heard of someone driving on the roads blindfolded to take their road-safety instincts for a spin? 

Me neither.

But I hear about people experimenting with the marketing equivalent more often than I’d like to. Driving marketing initiatives without consumer insight or data analysis removes the one thing you need to advance safely: Your vision. Market research gives you a clearly signposted direction, shows you where not to go and maps out where those in your space are trying to get to. 

AI tools are like a GPS in the equation. They can calculate the fastest route, anticipate roadblocks and set you up for an efficient journey. In other words, they shift your approach from manual or semi-automated labor to accurate, real-time guidance that shows you the way forward. Keep your hands on the wheel and save intuitive decisions for execution. 

So, when should you use AI for market research and which tools should be on your radar? 

To AI, or Not to AI … That’s the Question

Among worldwide market researchers, 47% claim they use AI regularly. Even so, diving into the markets has always been a mixed bag, and now, the debate over whether to incorporate AI is a hot topic. While traditional tools hold value, AI has taken market research to a new level. Here’s how:

  • Data analysis and insights: Traditional research is usually a slow grind. You collect data manually or maybe somewhat automate it with basic software. Then, you spend hours (if not days) trying to make sense of it. Artificial intelligence, on the other hand, can analyze vast amounts of data rapidly, uncovering patterns and trends that might go unnoticed by human analysts. Enter clear, nuanced and actionable insights.
  • Data collection methods: We previously relied on spreadsheets, gut instincts and the occasional focus group to figure out what our customers wanted. Now, AI plucks multi-source data in real time. We’re talking social media, website interactions and even video content analysis, facilitating a more comprehensive view of your market.
  • Real-time processing: In the past, gathering and analyzing data could take weeks and an endless supply of coffee. AI-powered tools give you the goods as soon as the data rolls in. With play-by-play updates (instead of waiting for the final score), you can pivot your strategy on the fly. This agility is crucial in today’s rapidly changing markets. 
  • Personalization and targeting: Traditional tools provide broad insights but don’t always hit the mark for every segment of your audience. AI can help you drill down into individual customer preferences, behaviors and needs, so you can craft hyper-targeted marketing strategies that speak directly to individuals rather than shouting into the void. 

While the traditional approach to market research is not entirely obsolete, integrating AI can up your game, helping you understand and reach your audience.

5 AI Market Research Tools You Should Know About

The ideal AI technology to integrate into your marketing strategy depends on your game plan. Where do you want (or need) to increase efficiency? How well do you know your blind spots? 

Each of the following tools offers features that cater to different aspects of marketing research, from a broad market overview to specific tasks like creating surveys or analyzing user behavior.

For a Bird’s Eye View on the Market: Semrush Market Explorer

Market researchers use Semrush Market Explorer for competitive analyses, showing insights into target markets and contenders. Glance at your market’s competitive landscape, including top players, their market shares, growth rates and trends. 

Market Explorer can create a list of your market’s top performers, identify your specific challengers, or analyze business categories for industry data. If you need rich insights from multiple sources to understand market competition and identify opportunities, this is the tool.

Semrush Market Explorer Features

  • Competitor benchmarking showing how you stack up in the competitive environment.
  • Audience segmentation, including demographics, behavior and interests.
  • Market trend analysis, covering growth, size and top dogs.


Creating Surveys: SurveyMonkey Genius

Looking for insights straight from the source? SurveyMonkey Genius is a survey and feedback management system that draws on 25 years of survey experience to create automated surveys. Enter your goals into a prompt, and generative AI will produce high-quality surveys to send to your target audiences. You can collect, interpret and analyze data to fuel growth and innovation.

SurveyMonkey Genius Features

  • Survey draft scores, encouraging improved structure and question formats to enhance user experience and boost completion rates.
  • Predictive analytics generate answer choices that eliminate bias and ensure accurate data collection. 
  • Machine learning models to comb through responses and shine a light on statistically significant trends.

Enhanced UX: Hotjar

Hotjar uses AI to help you understand user behavior on your website. By analyzing clicks, scrolls and navigation patterns, this behavioral analytics platform provides deep insights into what’s working and what’s not. Its AI algorithms help you identify friction points in the user journey, making it easier for UX designers and product managers to optimize their website performance for better conversions.

Hotjar Features

  • Heatmaps and session recordings show how your audience interacts with your website.
  • Conversion funnel analysis creates conversion steps and drop-off visualizations as well as market segment comparisons. 
  • A feedback tool allows you to collect instant visual user feedback, removing the guesswork from optimization.

For Ideation: ChatGPT

Whether you need content topic ideas, structured outlines or draft copy in a flash, ChatGPT is a powerful AI tool for market research and creative brainstorming. Its generative AI capabilities enable it to draw ideas for social media, SEO, lead magnets and more, based on your competitive environment. This one is a great AI tool for small businesses with resource limitations.

ChatGPT Features

  • Content ideation and creation derived from what’s currently performing online. 
  • Broad insights into the target audience’s behaviors and preferences within your market.
  • Marketing strategy methodology and execution support.
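If you want to script this kind of ideation rather than use the chat interface, a rough sketch with the OpenAI Python SDK might look like the following. The model name ("gpt-4o-mini") and the prompt wording are illustrative choices, not recommendations from this article.

```python
# A rough sketch of ideation via the OpenAI API, similar in spirit to
# prompting ChatGPT in the browser. Model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def brainstorm_topics(industry: str, audience: str, n_ideas: int = 5) -> str:
    prompt = (
        f"Suggest {n_ideas} blog post ideas for a {industry} company "
        f"targeting {audience}. Return one idea per line."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(brainstorm_topics("contact center software", "BPO decision-makers"))
```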

For Basically Everything: Google Analytics

Google Analytics might be the grandfather of digital marketing tools, but it’s still one of the best. It’ll sift through mountains of data to show who your audience is, where they come from, which devices they use and how they interact with your site. 

With machine learning algorithms, Google Analytics can now predict user behavior, segment audiences more effectively, and even provide automated insights that would otherwise require a dedicated analyst.

Google Analytics Features

  • Predictive metrics for user behavior, including bounce rate, time on site and user flow. 
  • Advanced audience segmentation allows you to filter and analyze segments of your users, drawing insights about high-value customers or those who perform a desired action on your site, for example. 
  • Automated insights with customizable metrics, event tracking and individual behavior data.
  • Google Trends integration to analyze seasonal shifts, popular topics and competitor statistics.
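For teams that want to pull these numbers programmatically, a minimal sketch using the official GA4 Data API Python client (google-analytics-data) could look like this. The property ID is a placeholder, and the dimension and metric names come from the standard GA4 schema.

```python
# A minimal sketch using the GA4 Data API Python client.
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
    DateRange, Dimension, Metric, RunReportRequest,
)

def sessions_by_channel(property_id: str):
    client = BetaAnalyticsDataClient()  # uses Application Default Credentials
    request = RunReportRequest(
        property=f"properties/{property_id}",
        dimensions=[Dimension(name="sessionDefaultChannelGroup")],
        metrics=[Metric(name="sessions"), Metric(name="engagedSessions")],
        date_ranges=[DateRange(start_date="28daysAgo", end_date="today")],
    )
    report = client.run_report(request)
    for row in report.rows:
        print(row.dimension_values[0].value, row.metric_values[0].value)

# sessions_by_channel("123456789")  # replace with your GA4 property ID
```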

For Audio and Video: Speak

Audio and video have long been part of market research, and they’re increasingly showing up in content marketing. Speak is an AI tool that transcribes, analyzes and categorizes multimedia content — including interviews and focus group data.

It’ll assess the impact of your latest video campaign. It’ll even transcribe and scrape video and audio assets to help create SEO copy across other content channels. If video is your game, Speak is your AI assistant.

Speak Features

  • AI audio- and video-to-text converters, condensing hours of work into minutes.
  • Qualitative trend and pattern detection from unstructured data inputs, such as meetings and surveys.
  • Web scraping to analyze or summarize website pages or entire websites.
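Speak’s own API isn’t covered here, but as a rough stand-in for the audio- and video-to-text step, here is what transcription looks like with the open-source Whisper library. The file name and model size are placeholders.

```python
# A stand-in for the transcription step using open-source Whisper
# (pip install openai-whisper). Not Speak's API.
import whisper

def transcribe_interview(path: str = "focus_group.mp3") -> str:
    model = whisper.load_model("base")   # small, CPU-friendly model
    result = model.transcribe(path)
    return result["text"]

if __name__ == "__main__":
    print(transcribe_interview())
```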

Market Research Best Practices

While AI tools can lift your market research game, you need to know your destination to steer them in the right direction. First, decide which kind of research you need:

Primary vs. Secondary Market Research

Primary research involves collecting new data directly from sources (used for UX, brand perception or product interactions). AI tools like Speak, Hotjar and SurveyMonkey Genius focus on gathering primary data.

On the other hand, secondary research uses existing data, which is great for competitive analysis, segmentation and market trends. Semrush Market Explorer and Google Analytics are solid tools for secondary data analysis.

Qualitative vs. Quantitative Data

Qualitative data provides insights into why people behave in certain ways, while quantitative data focuses on numbers and statistics. If you need to know how many people visit your site, go for quantitative. To understand why users navigate your site the way they do, go for qualitative.

Steps for Conducting Effective Market Research

  • Identify the audience: Before jumping on the data collection, know who you’re targeting. AI tools can help segment your audience based on demographics, behaviors and interests.
  • Define objectives: Are you looking to launch a new product, improve customer satisfaction or increase market share? If unsure, remember AI can help you define and refine these goals.
  • Select your research methods: Depending on your objectives, select the appropriate methods — surveys, focus groups, social listening, etc. If you haven’t already chosen an AI tool, this stage will clarify which one you need.
  • Develop a plan : A solid plan outlines your research approach, timelines and resources. AI tools can automate parts of this process, ensuring your plan is comprehensive and flexible.
  • Collect and analyze data: Here’s where you leave the heavy lifting to the robots, who will gather data from multiple sources and analyze it in real time.
  • Present findings: The final step is to present your findings in an actionable way. AI tools help create visualizations, reports and even predictive models to support your recommendations and help you hit a home run.

Boost Your Research with AI Marketing Tools

Marketing couldn’t be marketing without a healthy dose of research. With the right tools, this phase becomes simpler, more accurate and tons more productive. By integrating AI into your market research process, you’re not just following the competition; you’re giving yourself the momentum to overtake.

Check out how these key players stack up against your current methods and where they can fill the gaps. Once you start using AI tools for market research, you’ll wonder how you ever got by without them.

Aleisha White


Aleisha White is a Brafton writer based in Shanghai. In her downtime, you can find her wrestling with rigging lines, musing over the cosmos and hunting for four-leaf clovers.



Speaker 1: We're at that stage with AI tools where they want your money. They are mature enough to suck in those subscriptions. They want to get as rich as possible. But the good thing is is there are free tools that are available that are super powerful and this list contains those tools as well as ones that have really generous free plans. The first one you should know about is Heuristica or Heuristica. And the one thing I love about this is that it's AI driven and it just gives you the ability to mind map a whole kind of research field. Now what we can do, I've put in here organic photovoltaics and you can see I've got what, the cons and pros and I've also got examples down here. The way this works is that when you first log in, you put in what you want to put in this middle bit and then you can ask for elaborate. You've got all of these down the side that you can put in. So I can put in how if I click here and I click how, then it adds here. Organic photovoltaics work by converting sunlight into electricity and I can add it to my exploration and you can see here that I can add as many mind maps as possible. And I can click here and I want to know the concepts related to a specific input. So here I want solar cells, I want renewable energy, I want semiconductor, I want it all. Don't lie, beep that out. And then we've got all of these concepts that come off it and it's just so, so good. And then I can say okay, well actually I want to know about energy conversion. So what is energy conversion? And then I've got AI generating this response and I can add it in here. And this is a mind map that can just get massive. It's completely free, a really great way to start your literature review if you don't know where you want to start. It's really powerful and I think this is something that everyone should start with if they're sort of like first into a research field. Check it out. The second one you should know about is Open Read. Open Read has pricing plans for all your needs but this is the one we're interested in. We get five paper expressos, we get five paper Q&A, we get five chat with Oat, we get AI summary times five a month. So this is actually quite generous considering how powerful this is. If I go in and I look for all of my files, so I can upload here a PDF document and all I have to do here is say okay, well what PDF document do I want? What PDF document do I want? I'm going to drag and drop one of my papers right here. You can see that it's uploaded and it's got all of the things I want to know about. So it's got basic information. It's kind of given me not only just like a little summary of the thing but I've also got the DOI and other basic information. If I want to know more about this, I can ask more questions about it. This is the like little chat bar up here. Hi, I'm Oat and you can ask Oat anything and if I want to know more about this, where do I go? I can go up here and ask for a paper espresso and I love this paper espresso because what it gives you is all of the bite-sized chunks for reading about this paper. Please allow 10 to 20 seconds. Brilliant, it took less than that. I love it. We've got the achievements and significant. We've got the background and context. We've got discussions, interpretation. We get five of these a month but if you just sort of like use this on the papers that are big, that are chunky or when you're just tired and you don't want to read it. That's also completely okay. Up here, we can also ask paper Q&A, which is brilliant. 
We get five questions that we can ask about this paper. Who is the best author on this paper? Is it Andy Stapleton? Or is it any Stapleton? There we are. Oh, it doesn't say I'm the best, but I am the best. The third one you should know about is explain paper. I absolutely love this paper. The third one you should know about is explain paper. I absolutely love this. It's completely free. I'm amazed that it's still completely free but essentially you upload a paper to it, which is brilliant and then you just select texts that you want explained. So here, if I want this stuff, I just simply select it all. Then it pops up here and I can explain this as a middle schooler, as a five-year-old, to a high schooler, let's say middle schooler. I click explain and ultimately here, we've got a little kind of undergrad explanation of all of the things. We can ask as many follow-up questions as we want. It's even got related resources that you may want to know about as well. Really fantastic, completely free. Love it, explainpaper.com. Another tool that you should know about to help you read papers is Paper Brain. Paper Brain is really fantastic. All we have to do is upload a paper. This is one of mine, Accurate Thickness Measurement of Graphene. This is run by Dr. Cameron Shearer, fantastic scientist. I love you, Cameron. And here, I've just asked it, what is this paper about? And it's just given me a little simple like AI-generated response. A great tool for asking questions about specific papers and receiving AI-generated answers back. I'm going to ask this one, who is the best author on this? Please let it be me, please let it be me, please let it be me. Do it. Cameron Shearer is the best author on this paper. Cameron, I'm coming for you. The fifth tool that you should know about is Einblick, einblick.ai, and it's chart generation AI. Three simple steps. Upload your dataset, describe what chart you want, and then generate a chart. So it's simple. I've clicked here, try random example, because I can't be asked to upload in my own spreadsheet. But here you can see, all you have to do is put in a link to your spreadsheet, or you can upload a file. It creates a little sort of table here of all the information you put in. Then you say what chart you want to generate. In this case, they wanted a scatter plot of N2O versus CH4 admissions, and this is what they've ended up with. Brilliant. A really great way of just getting those simple first draft charts of your data that you can put in any sort of like supervisor meetings or that sort of stuff. Really powerful, really free. Check it out. Really free. More free than free. The sixth tool you should know about is Taverly. Taverly.ai or taverly.com. Here we've got an overview, but this is what I'm interested in, the research assistant. I'm going to click here, and then it's asking me, what should I research for you? So polite, these AI tools. So I want to know about organic photovoltaic devices. And what this does, which is super powerful, is it creates an agent based on your search question, and it sends it off to the internet to get information for you. This is all about making sure that AI doesn't hallucinate and come up with random bullshit that doesn't actually exist in the world. So you can see it initiated a custom research agent. Then it was looking for online sources. Then it found these things. And then it comes up with this at the bottom. It gives you a nice kind of overview of the research you've wanted. 
And I found that if I ask specific questions, it does give me specific answers with AI generated text that was researched properly. It gives links to all of the stuff you want to know about. And it was really fun. I've tried this one, the best organic photovoltaic devices in the current market, which is something like I need to know about if I'm in that research field. And you can see that it gave me all of this stuff saying that these guys were the best ones at the moment. Where are they? They are Dracula Technologies. And so if I click here, it will open up where they find this information. You can see it's super relevant and recent information. And this is the sort of stuff I want to know about if I'm in a particular research field. It searches the online web space. It doesn't do scholarly stuff as far as I know at the moment, but that's still super powerful for free. Go check it out. The seventh one you should know about is PowerDrill. PowerDrill has got an upgrade option, but it's really powerful in this simple format. Down here, we've got chat, but this is really easy. What you have to do is upload data sets. If I've got my research, all I need to do is go in here and upload data sources. These can be text, web pages, or files. So I'm going to put in my own research. And here we are. It says, needs larger storage size. Your current upload is two megabytes larger than data limit. That's all right. I don't want to do that. Upload that. There we are. I'm going to upload as much as I can for this data set for free. And once it's uploaded, I can go in and ask questions about all of that data set, which is super powerful and something that quite often you have to spend a lot of money to get. To make the most of this small amount of space, you want to put in review papers that contain a lot of information, and that will help you use this to its strongest ability and power. So once all our data is up there, all we need to do is go to simple chat. Then we click new chat, talk to GPT, click on this one and say, no, I want to talk to Andy's research. And then we can ask it anything about the research. Who is the best author on these papers? Don't let it be Cameron. Don't let it be Cameron. Be me, be me, be me. Oh, it hasn't given me an answer. What a cop out. Syspace, Syspace I absolutely love. Do hours worth of reading in minutes. It's brilliant. You can upload papers. You can do a literature review from Syspace papers or my library. You can extract data from PDFs. You can read with AI Copilot. There's all limits on this, but it's all quite generous. But if I go to my library, the one thing I love about this is when you upload PDFs, it gives you a too long, didn't read, gives you conclusions. It gives you all of these different things that would take you hours to extract manually from these papers by just reading them. But it's a great way to get this massive table of too long, didn't read, all of the conclusions, all of the stuff together, simple, easy. And look, all you need to do here is open Copilot. You can ask questions. It's so easy. If can't believe it's free and yeah, check it out. Syspace, but strangely their URL is typeset.io. The next one you should know about is NextNet. NextNet is fantastic if you're in the drug and health space. So here, all I need to do is click on discover, type in new drugs, for example. I don't know what I'm doing. This is well outside of my comfort zone. But it's completely free. 
It will go away and search for all of the recent literature and stuff on gene expressions and drugs and all that sort of stuff. But NextNet is a fantastic tool if you're in this research space. I just wish that it would have something similar in a different research space. So here we are. This tells you everything, new drugs. Then it's got national drug law, anti-tumor effects, isolates, unsafe. You know, clearly if you know what you're doing and you know the field, you can put in something to search much better. Ultimately, it gives you this super searchable graph on a research set based on your query. Brilliant, go check it out. GetNextNet.com, boom. The last ones that I want you to know about are all linked together and it's just the old school general AI chat that I've been using for, I was going to say years now, but for like a year since it came out. These are ChatGPT, the OG, the OG of AI, as well as Perplexity, as well as Bing. I use these all of the time. If we go to Bing Chat, I use these all of the time as my go-to place. They are free, they are super powerful, and you can ask it almost anything. The problem with using these over something that's specifically for research is sometimes it can hallucinate. Sometimes it can give you too broad of a response. Sometimes it's not academic enough. But these are all overcomable by chatting with it, asking for it to change its response, or for like asking it more detailed knowledge about a certain thing, or just telling it's wrong. You say like, no, I actually don't want that. I want you to do it in these ways. These three tools have changed the world and they have changed research forever. So ChatGPT, I love, that is my go-to for every day. Perplexity, I go there if I want it to give me references that I can sort of like do further information. And I go to Bing if I really want sort of like more control over the sorts of response it gives me. I can ask for more creative, I can give balance, I can ask for more precise stuff. And there are really sort of like a great number of ways you can manipulate the chat before asking the question in Bing, I love it. So those three tools are where I go all the time. They're completely free, I love them. They've changed the world, they've changed research forever. Let me know in the comments which one you prefer, ChatGPT, Perplexity, or Bing. So there we have it, there are the 12 free AI tools that are the best for 2024. Let me know in the comments if there's any I've missed or ones that you prefer, I love to know about them. And also if you love this video, go check out this one where I've put all of the five mind-blowing AI tools that you should be using for research that you probably don't know about, go check it out.


PrecisionChain Opens New Opportunities for Global Collaboration in Precision Medicine Research

Precision medicine is an approach to healthcare that tailors medical treatment and interventions to the individual characteristics of each patient, such as their genetic makeup, environment, and lifestyle. This requires combining different data types such as clinical and genetic data. Progress has been slowed by challenges like distinct data formats, strict privacy and security requirements and insufficient technology.

A new study suggests a solution called PrecisionChain, a data sharing and analysis platform built on blockchain technology. This system can securely store, harmonize, share, integrate, and analyze both genetic and clinical data. It uses a unified data model, an efficient indexing system and an end-to-end analysis pipeline to make data more accessible and usable.

research data analysis tools

“PrecisionChain is an immutable and secure system based on blockchain,” says corresponding author Gamze Gürsoy, the Herbert Irving Assistant Professor of Biomedical Informatics at Columbia University and a core faculty member at the New York Genome Center. “It keeps an immutable log of everything that is done on the data, including sharing, querying, and analyzing. This makes it easy to both track and audit, which gives institutions more confidence to share data because they can ensure it remains private and secure, and they can benefit from any discoveries made using the data.”

PrecisionChain was introduced in a Nature Medicine study published on Sept. 3, titled “A framework for sharing of clinical and genetic data for precision medicine applications.”

Harmonizing data for individual treatment

Both clinical and genetic data are often stored in different formats, making it difficult to combine information from various sources. PrecisionChain, in addition to providing blockchain-based decentralization, immutability, and data integrity, solves this issue by integrating well-established systems like the OMOP Common Data Model and the Variant Call Format (VCF) to harmonize the data. This allows for complex searches that combine clinical and genetic data, enabling the creation of specific patient groups based on both their genes and medical history.
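The article does not show PrecisionChain’s internals, but a toy sketch can illustrate the harmonization idea: parse a single VCF data row and attach it to a clinical record keyed by the same person identifier, so both can be queried together. The field names below loosely echo OMOP-style conventions and are purely illustrative.

```python
# A toy illustration (not PrecisionChain's code) of harmonizing genetic and
# clinical data: parse one VCF data line and attach it to a clinical record
# keyed by the same person identifier. Field names are illustrative only.

def parse_vcf_line(line: str) -> dict:
    """Split a single VCF data row into its fixed fields."""
    chrom, pos, var_id, ref, alt, qual, flt, info, *rest = line.strip().split("\t")
    return {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt}

clinical_record = {          # e.g. drawn from an OMOP-style person/condition table
    "person_id": 1001,
    "condition": "type 2 diabetes",
}

vcf_line = "chr7\t117559590\t.\tA\tG\t50\tPASS\t."
harmonized = {**clinical_record, "variant": parse_vcf_line(vcf_line)}

print(harmonized)
# {'person_id': 1001, 'condition': 'type 2 diabetes',
#  'variant': {'chrom': 'chr7', 'pos': 117559590, 'ref': 'A', 'alt': 'G'}}
```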

“One of the big hurdles for researchers is learning how to work with many different data formats and tools,” says lead author Ahmed Elhussein, a PhD student in the Columbia University Department of Biomedical Informatics. “With PrecisionChain, researchers only need to interact with one framework. We hope this reduces some of the technical barriers for conducting precision medicine research.”

By securing this data together in a single pipeline, PrecisionChain creates more opportunities to provide personalized care for patients globally.

“Precision medicine tailors treatment to the individual characteristics of each patient, which leads to more effective and targeted healthcare,” says Gürsoy, who is also a member of the Herbert Irving Comprehensive Cancer Center. “In order to comprehensively understand the characteristics of each patient, we need to look at both their clinical and genetic information, since many complex diseases are influenced by both environmental and genetic components. For collaborating institutions, it is important to increase the sample size to find connections between genetics and health outcomes.”

Ensuring security while also enabling research

Though blockchain technology is still new, it has been shown to provide both security and immutability for data. However, it also comes with a series of challenges, especially when it is used to store large-scale datasets. PrecisionChain overcomes these challenges by introducing innovations in data indexing and compression to protect sensitive patient data around the world. It ensures that the data is tightly controlled, cannot be tampered with, and tracks every action on the network, making it easy to see who accessed what. This not only safeguards privacy but also supports important research by providing a clear and reliable way to manage and monitor data, while enabling multi-modal data analysis.

“PrecisionChain is designed to be easy to use, regardless of a researcher’s familiarity with blockchain technology,” Gürsoy says. “It has a simple, user-friendly interface that lets researchers access the network, search for data, and run analyses. By simplifying the complex parts and providing interactive tools like Jupyter Notebooks, PrecisionChain makes it easy for researchers to leverage blockchain’s benefits in their studies.”

Additional information

All authors (from Columbia unless noted): Ahmed Elhussein, Ulugbek Baymuradov (New York Genome Center), NYGC ALS Consortium (New York Genome Center), Noémie Elhadad, Karthik Natarajan, and Gamze Gürsoy.

“A framework for sharing of clinical and genetic data for precision medicine applications” was published Sept. 3, 2024, in Nature Medicine.

This work has been supported by the NIH grants R00HG010909 and R35GM147004.


11 Ways To Do SEO Content Research Beyond Competitor Analysis

Take your SEO strategy to the next level with content research beyond competitor analysis. Discover how to develop unique niche strategies for long-term success.

Early SEO milestones might be easy, but scaling the results needs an upgraded approach.

What could that look like?

Like startups that come up with a solid niche idea and successfully compete with larger companies, we SEO pros and content strategists need to work harder to develop unique, fresh, niche strategies.

However, whenever we think of creating strategies, we start looking at what competitors are doing. We assume that we can win this game by outperforming our competitors.

Remember: we win when our focus is on winning the game and not on how to make our competitors lose.

So, here comes an upgraded approach to our SEO strategy – going beyond competitor analysis.

However, since our SEO strategies heavily rely on content, we’ll discuss content research beyond competitor analysis in this blog.

Now, What Is Content Research Beyond Competitor Analysis?

Most of us analyze our competitors to develop content ideas. It’s easy and quick.

But…

What if your competitors are ranking in the top positions but are not serving users’ intent?

What if your competitors are not yielding enough traffic despite better rankings?

What if your competitors are driving massive organic traffic but not enough conversions?

Also, some competitors may be doing extremely well on the content KPIs that drive SEO growth.

You may feel that if the competitors can achieve such results in one year, you can achieve them in six months by copying their strategies.

But that’s where you limit yourself in growth. Your competitors’ SEO and content teams might also be struggling; who knows?

This is why your content research must go beyond competitor analysis.

In this approach, we don’t look at what content competitors have written.

We don’t want to copy them or repeat their mistakes. We want to work in ways that truly resonate with our target audiences, geographies, business models, and industries.

So, the “content research beyond competitor analysis” approach helps us bring unique and fresh perspectives to our content research, creating incredible value for our audience and clients and scaling our SEO results extensively.

11 Ways Of Content Research Beyond Competitor Analysis To Scale SEO ROI

We have 11 ways to use this approach. Let’s uncover them one by one with step-by-step processes and examples.

1. Use Semrush

This is our basic step of content research since most of our initial goal is driving organic traffic.

And because Semrush is handy for most of our team members at Missive Digital, we log in immediately to start our content research instead of looking at competitors.

We enter seed keywords, exact keywords, long-tail keywords, and more to do our content research, depending on search volume, keyword difficulty, and search intent.

For example, we have put “diamond jewelry” into Semrush and will add the filters according to our SEO strategy.

Screenshot from Semrush

Another content research feature of Semrush that we use extensively is Topic Research. We choose the content topics based on which ones relate directly or indirectly to our website.

Screenshot from Semrush

2. Use Ahrefs

To do the content research on Ahrefs, we follow the same steps as Semrush, but here, we also use Content Explorer.

We filter based on Page Traffic and referring domains to identify queries that can bring us traffic and conversions.

Screenshot from Ahrefs 2024

Then, we also examine how frequently the content is republished, which gives our team an idea of when to schedule it next for content optimization based on its performance.

Screenshot from Ahrefs, August 2024

3. Use Google News

While auditing the content, if we realize that the client is already writing a lot of content, we try researching content ideas through Google News.

Also, for some D2C industries like jewelry, trends often come from celebrities wearing the products – so we keep a close eye on Google News.

For example, the screenshot below shows a ‘B’ necklace worn by Selena Gomez in reference to her boyfriend.

Screenshot from search for [diamond necklace]

Bingo! Now we have something to discuss with the client’s team for our next content piece.

4. Use People Also Ask, AlsoAsked

Since most B2B IT and SaaS clients are highly technical, we sometimes struggle to understand the topic and create a content strategy.

People Also Ask on Google Search and AlsoAsked.com by Mark Williams-Cook work like a savior during our content research.

Screenshot from search for [kubernetes architecture]

5. Check Google Trends

No matter what industry you are in, you’ve got something or other trending.

In our SEO industry, SearchGPT is trending.

Google Trends

So it’s worth writing about it early to take advantage and grab a share of the traffic.

See, a lot of people are writing about it:

Screenshot from search for [searchgpt]

So, it’s worth constantly watching what’s trending via Google Trends.
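If you want to monitor trends programmatically, Google Trends has no official API, but the unofficial pytrends library can pull interest-over-time data. The keyword and timeframe below are just examples.

```python
# Pull Google Trends interest over time via the unofficial pytrends library
# (pip install pytrends). Keyword and timeframe are examples.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-US")
pytrends.build_payload(["searchgpt"], timeframe="today 3-m")
interest = pytrends.interest_over_time()   # pandas DataFrame, one column per keyword
print(interest.tail())
```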

6. Hop On ChatGPT Or Gemini

Remember, we are here to do content research on ChatGPT or Gemini, not to use the titles they suggest as-is.

Here is a sample content research prompt that we used for a contact center software company on ChatGPT:

Screenshot from ChatGPT

And here are the responses below:

Screenshot from ChatGPT, results for blog posts

Since the suggested topics are not up to the mark for this audience (“BPO” in this case), we’ll use them as a starting point and pick seed keywords or topics such as:

  • Lessons from a Legacy Contact Center Software Company.
  • The Contact Center Software Market In The BPO Segment.
  • Optimizing Your Contact Center Operations.
  • How to Drive Innovation in Your Customer Support Department?

7. Monitor Social Media

Yes, we are all active on social media, so we can use it for our content research. Still, we are not considering competitors on social media at the moment.

This viral X thread inspired us to write a blog:

X tweet

Similarly, this article on content research was inspired by my recent post on LinkedIn:

LinkedIn

These are examples of self-created social media content that can be turned into blogs.

However, you can keep monitoring the types of content that get the most visibility and engagement on social media – be it LinkedIn, Instagram, X, or any other platform.

Turn them into your blogs or webinars, but don’t forget to credit the original posts, since it’s their content idea.

8. Dive Into Industry-specific Research Studies

The most unique way to research content ideas is to read your industry-specific research studies extensively. And there’s no one way to do it.

For example, for one of the ecommerce consulting companies, we can get various content ideas from HBR’s eCommerce pricing test:

  • Why Should Ecommerce Brands Stop Offering Free Shipping?
  • X Benefits of No Free Shipping or Conditional Shipping.
  • Free Shipping vs. Conditional Shipping.

Don't just give away free shipping: infographic from HBR

In the study below by Broadridge on Digital Transformation, each theme can become a topic cluster, and each can have its own spoke-like content topics.

Image from study by Broadridge on Digital Transformation

For example, if we take Unleashing Artificial Intelligence, we can pick up so many topics out of just one graphic:

Priority areas for AI investments

9. Check Industry-Specific Forums/Communities

Most of our clientele includes IT companies, and we have used IT forums and communities like StackOverflow for content research.

For example, we can come up with the below topic clusters when covering Flutter for the non-technical and technical target audiences:

  • Flutter animation widgets.
  • Flutter dependency management.
  • Why add Firebase to your Flutter app?

Stack Overflow

Similarly, there will be many such forums or communities for your client or employer that you can peek into for content ideas, without resorting to competitive analysis.

10. Google site:reddit.com “my topic”

One such unique idea by Kunjal Chawhan is to Google site:reddit.com “my topic”, and let’s see what content ideas look like for a couple of topics:

Looking at the search results, below are the topics that we can definitely create:

  • X Most Popular Social Media Platforms for Ecommerce.
  • How to Use Video Podcasts to Drive Ecommerce Sales?
  • How to Boost Ecommerce Sales When Digital Marketing Seems Expensive?

So yes, Kunjal’s way of content research is amazing, and from that, you can similarly Google:

site:“your industry’s leading site” “topic”

For example:

  • site:searchenginejournal.com “ai content”
  • site:quora.com “ai content”
  • site:practicalecommerce.com “sales”
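If you run these searches often, a small helper can build the site-restricted query URLs for you. The sites and topic below are just examples.

```python
# Build site-restricted Google query URLs for a topic across several sites.
from urllib.parse import quote_plus

def site_query_url(site: str, topic: str) -> str:
    query = f'site:{site} "{topic}"'
    return "https://www.google.com/search?q=" + quote_plus(query)

for site in ["reddit.com", "quora.com", "searchenginejournal.com"]:
    print(site_query_url(site, "ai content"))
```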

Let’s move on to the last but not least method of content research that doesn’t rely on looking at competitors.

11. See What Competing Sites Have NOT Covered

Now you might wonder, “Weren’t all of the above ways of doing content research without competitor analysis?”

Yes, they are ways to research content ideas without looking at what competitors have written.

But here, I’m trying to make a point where you have to see exactly what indirect competing sites are NOT writing about despite targeting the same industry, keyword clusters, and audience.

What is an indirect competing site?

An indirect competing site is a website that ranks for the industry and search queries of your target audience but is not exactly your product/service competitor. This can be a marketplace, publishing site, or product review site.

Let’s take a website, “leadsquared.com,” for indirect competitive analysis and pick the queries that rank beyond the 50th position and have a keyword difficulty of less than 29.

Pick those queries and search on Google: site:leadsquared.com “sales funnel vs sales pipeline”.

site:leadsquared.com

In short, you can cover the below topics:

  • Sales funnel vs. sales pipeline.
  • Sales funnel vs. marketing funnel.
  • Sales funnel vs. flywheel.

Just ensure these content topics align with your offerings to bring maximum ROI.
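If you export the indirect competitor’s keyword data to a CSV, the filtering step described above (position beyond 50, keyword difficulty under 29) is a quick pandas operation. The file name and column names here ("Keyword", "Position", "KD") are hypothetical and depend on your export.

```python
# A quick sketch of the keyword-gap filtering step. The CSV file name and
# column names are hypothetical and depend on your keyword export.
import pandas as pd

keywords = pd.read_csv("leadsquared_keywords.csv")
gaps = keywords[(keywords["Position"] > 50) & (keywords["KD"] < 29)]
print(gaps[["Keyword", "Position", "KD"]].head())
```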

How Will Content Research Beyond Competitor Analysis Contribute To SEO Efforts?

When you go beyond competitor analysis for content research, you discover a few benefits:

  • You innovate – With innovative content ideas, you can experiment and build better strategies that can bring unbelievable results. Also, with AI taking the space predominantly, businesses are looking for innovation in their business and marketing. So when you innovate, you may get better attention and even resources.
  • You get niche opportunities – Instead of just focusing on what competitors are doing, you go deeper into understanding your target audience and explore new content ideas that your competitors might have missed. In such scenarios, you get better results since competition is reduced.
  • You create unique, audience-specific content – My LinkedIn post saw great engagement because it resonated with its audience. This opened us to something unique and specific to the pain point of SEOs and content strategists: content ideation to scale SEO results with a not-so-usual approach. Such content helps us build authority in the market, which is essential today to becoming market leaders.
  • You capitalize on emerging trends – Being an early adopter of something has huge potential for success. When you create your content strategy focused on what’s new or trending in your industry before it becomes mainstream, you get the most eyes right from the beginning and even repeat eyes going forward.
  • You build better engagement and loyalty – You can extend beyond blogs, a traditional way of driving SEO results. Videos, whitepapers, case studies, user-generated content, and many more content formats can take the lead in building user engagement and brand loyalty through SEO.
  • You earn backlinks – Yes, such unique content may require less effort to build backlinks since it can earn them.

Stop looking at competitors for content research; try using these fresh and unique ways to drive better content ROI.

Just remember two things: Competitors are not always right, and you don’t have to rely on them when developing your SEO content strategies.

You can copy and paste your competitors’ strategies to achieve certain SEO milestones, but creating history requires an upgraded approach. What say?

More resources: 

  • SEO Competitive Analysis: The Definitive Guide
  • Leveraging Generative AI Tools For SEO
  • Perfectly Optimized Content From Start To Finish

Featured Image: Natalya Kosarevich /Shutterstock

Himani is the Founder of Missive Digital, an organic marketing agency that focuses on enhancing the brand positioning of the ...


Information Systems IE&IS

The Information Systems (IS) group studies novel tools and techniques that help organizations use their information systems to support better operational decision making.

Create value through intelligent processing of business information

Information Systems are at the core of modern-day organizations, both within and between organizations. The Information Systems group studies tools and techniques that help organizations use these systems in the best possible way and get the most value out of them.

In order to do that, the IS group helps organizations to: (i) understand the business needs and value propositions and accordingly design the required business and information system architecture; (ii) design, implement, and improve the operational processes and supporting (information) systems that address the business need, and (iii) use advanced data analytics methods and techniques to support decision making for improving the operation of the system and continuously reevaluating its effectiveness.

We do so in various sectors from transportation and logistics, mobility services, high-tech manufacturing, service industry, and e-commerce to healthcare.

Against this background, IS research concentrates on the following topics:

  • Business model design and service systems engineering for digital services.
  • Managing digital transformation.
  • Data-driven business process engineering and execution.
  • Innovative process modeling techniques and execution engines.
  • Human aspects of information systems engineering.
  • Intelligent decision support through Artificial Intelligence and Computational Intelligence.
  • Data-driven decision making.
  • Machine learning to optimize resource allocation.

Research Areas

We work on Information Systems topics in three related research areas.

Process Engineering

Process Engineering (PE) develops integrated tools and techniques for data-driven decision support in the design and execution of…

AI for decision-making

AI for Decision-Making (AI4DM) develops methods, techniques and tools for AI-driven decision making in operational business processes.

Business Engineering

Business Engineering (BE) investigates and develops new concepts, methods, and techniques - including novel data-driven approaches - for the…

Application domains

We focus on the application of Information Systems in the following domains.

Healthcare

Information Systems are the backbone of modern health(care) ecosystems. They are critical for clinical research, clinical operations, and…

Smart Industry

The digital transformation of industry is leveraged by Information Systems providing integrated data and process management and AI-enabled…

Transportation and Logistics

Information Systems facilitate monitoring and planning of transportation and logistics resources. By doing so, they ultimately help to…

Smart Mobility

Information Systems focuses on the business architecture design of new mobility solutions that are safe, efficient, affordable and…

Service Industry

Service organizations, including banks, insurance companies, and governmental bodies, fully rely on information provisioning to do their…

Meet some of our researchers

Sybren de Kinderen, Isel Grau Garcia, Yingqian Zhang, Laura Genga, Pieter van Gorp, Konstantinos Tsilionis, Remco Dijkman, Baris Ozkan, Karolin Winter, Oktay Türetken, Laurens Bliek, Alexia Athanasopoulou

  • Meet all our researchers

human centric AI

ENFIELD & EAISI event: Human Centric AI

Together with EAISI, ENFIELD will present key findings on ongoing projects, available funding for researchers and collaboration…

research data analysis tools

EAISI lecture of Visiting Professor Chiara Ghidini

Process, Data, Conceptual Knowledge, and AI: What can they do together? Chiara Ghidini is a full professor at the Free University of…

valorization

Annual AI Conference ELA - Siemens 2024

The Euregio AI Triangle (RWTH Aachen, KU Leuven and TU Eindhoven) and Siemens are cordially inviting all AI enthusiasts and interested…

Recent Publications

  • See all publications

Our most recent peer-reviewed publications:

  • Acceptance of Mobility-as-a-Service: Insights from empirical studies on influential factors
  • A revised cognitive mapping methodology for modeling and simulation
  • Topic specificity, business models and process models
  • A reference architecture for reverse logistics in the high-tech industry

Open source

We encourage innovation from our research. This is why we share the open-source code from our research projects.

  • Link to our open source codes

Work with us!

Please check out the TU/e vacancy pages for opportunities within our group. 

If you are a student, potential sponsor or industrial partner and want to work with us, please contact the IS secretariat or the Information Systems group chair,  dr.ir. Remco Dijkman


The “Hidden” AI Tools That Are Driving Educational Gains


There’s more to AI than ChatGPT and bot-generated worksheets.

Recently I watched a focus group of seven education policy professionals discuss their views on schools and artificial intelligence. They were smart, engaged people, and their politics ranged from extremely conservative to very liberal. But when it came to AI, they all held similar views: very cautious optimism tempered with a heavy dose of skepticism.

One participant’s opinion seemed to encapsulate the group’s thinking overall: “Why can’t we just stick with the Socratic method? It’s been around for 2,000 years. Do we need to replace that with something else? I don't know.”

After listening to this group for more than an hour, I came away with two big takeaways: First, nobody wanted AI to replace teachers. This group understood – as do I – that authentic teacher-to-student interaction is where real learning occurs, and they did not like the idea of AI replacing human-to-human contact. Rather, they were open to AI-driven tools that could make teaching easier and learning more individualized.

Second, few of the participants could envision what such an AI tool might look like. Nor could they concretely describe how AI would support educators in the ways they imagined. It was all fuzzy to them.

I suspect this is also true of most parents, who, according to a recent survey, overwhelmingly think that AI is a valuable educational tool despite only 40 percent of them having used it with their kids.


So, there seems to be an information void among a lot of parents and even among those who work in education: they recognize the potential of AI to improve learning, but they don’t know what that means, or looks like, in practice.

I think I can help with a real-life example: the high-impact tutoring provided by Saga Education.

For many, tutoring can seem relatively easy, but it's actually quite difficult – and expensive – at scale. Some estimates put the cost of intensive high-impact tutoring at $1,500 to $2,500 per student. So if a school has 250 students who would benefit from tutoring, it could cost upwards of $500,000 to serve them.

For most schools, that kind of price tag is simply out of reach, despite convincing research showing that tutoring is one of the most effective ways to improve student achievement. The thing is, tutoring only works if the tutors know the content and are good at what they do. Pairing kids up with adults who aren't trained tutors, or with adults who have little knowledge about how students learn, often doesn't lead to learning gains. But when the tutor is skilled and adept at working with students, the gains can be extraordinary.

And that’s where Saga comes in. It is well-known in education circles, and its high-impact tutoring model is heavily researched.

Saga recently began using an AI-powered platform to help train tutors to be more effective. The platform was co-developed by researchers at the University of Colorado who are part of the Learning Engineering Virtual Institute – a seven-team effort across multiple organizations to drastically improve math outcomes in U.S. middle schools. I’m familiar with Saga because my organization, The Learning Agency , helps administer the virtual institute.

Research has shown that how tutors talk and interact with students significantly affects whether the students improve. There are a number of key “talk moves” tutors can make to maximize their impact with kids.

One technique is called “pressing for accuracy,” which simply means having the tutor ask the student to explain the concept or idea they just discussed. In other words, did the tutor check to make sure the student “got it”? Another successful tactic is called “pressing for reasoning” – prompting students to share their thinking behind an answer.

For example, a tutor might ask a question that gets the student to contribute to the conversation, such as “Can you give me an example of …?” Or, they might ask the student to explain their thinking, “Why did you use this approach to solve this problem?”

Tutors who use techniques like these get better results than tutors who don’t. With that in mind, here’s how AI works to help tutors be better.

After tutors work with students on the Saga platform, the AI analyzes the conversations that took place. The AI notes who did the talking, what they said, the quality of the conversations based on the various talk moves, and the questions that were asked (and how they were answered). Then, the platform creates several visualizations of its analysis, creating a timeline of when “talk moves” occurred, charts showing the frequency of the different “talk moves,” and other data points, like who did most of the talking – the student or the tutor – or whether the tutor allowed the student to give mostly one-word answers.
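To make this kind of analysis concrete, here is a minimal Python sketch of turning a session transcript into talk-move counts and talk-time shares. The transcript, the keyword rules, and the move labels are hypothetical simplifications for illustration; the Saga platform presumably relies on trained language models rather than keyword matching.

```python
from collections import Counter

# Hypothetical transcript: (speaker, utterance) pairs from one tutoring session.
transcript = [
    ("tutor", "Why did you use this approach to solve the problem?"),
    ("student", "Because both sides of the equation have x in them."),
    ("tutor", "Can you give me an example of another equation like that?"),
    ("student", "Sure, 2x + 3 = x + 7."),
    ("tutor", "Great. Walk me through how you would check your answer."),
]

# Simplified keyword rules standing in for a trained talk-move classifier.
def classify_talk_move(utterance: str) -> str:
    text = utterance.lower()
    if text.startswith("why") or "explain" in text:
        return "pressing_for_reasoning"
    if "example" in text or "check" in text:
        return "pressing_for_accuracy"
    return "other"

# Count talk moves for tutor utterances only.
move_counts = Counter(
    classify_talk_move(u) for speaker, u in transcript if speaker == "tutor"
)

# Approximate talk-time share by word counts per speaker.
words = Counter()
for speaker, u in transcript:
    words[speaker] += len(u.split())
total = sum(words.values())

print("Talk-move counts:", dict(move_counts))
for speaker, n in words.items():
    print(f"{speaker} share of talk: {n / total:.0%}")
```

Even this toy version surfaces the two signals highlighted above: which talk moves the tutor used and who did most of the talking.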

These AI-generated reports give tutors direct and specific feedback on how they can improve. It would take a coach or teacher about an hour and a half to watch and annotate a 30-minute tutoring session to provide the same level of constructive feedback. The Saga platform allows them to do it in about 30 minutes.

The efficiencies gained by using the platform allow one coach or teacher to supervise many tutors effectively, thereby increasing the number of students receiving high-dose, high-quality tutoring while simultaneously controlling costs.

This is the real promise of AI in education – giving educators the data, analysis, and tools they need to be the best teachers they can be. It’s also a technology that most parents would never see, even though their children might be benefiting from it tremendously.

So the next time you see an article or a report expressing skepticism or doubt about the role of AI in education, remind yourself that there’s more to AI than ChatGPT and bot-generated worksheets. In many ways, the “hidden” tools are the real future of AI in schools.

Ulrich Boser



States Improve How They Assess Coastal Wetlands’ Impacts to Reduce Climate Pollution

Updated Smithsonian-led ‘blue carbon report card’ measures efforts to quantify how much—and how effectively—tidal wetlands store carbon.


As coastal states grapple with the best nature-based solutions to reduce the effects of climate change on their residents and economies, organizations are developing tools to help assess and quantify the role that “ blue carbon” habitats play in this effort.


Although nothing will slow global climate change faster than reducing overall greenhouse gas emissions, boosting blue carbon—the atmospheric carbon dioxide that habitats such as seagrasses and salt marshes absorb and sequester—can make a difference. When healthy and left undisturbed, the roots, stems, and soils of these ecosystems are remarkably efficient at storing and accumulating carbon over centuries. Blue carbon habitats—which are often culturally and historically significant places—also provide homes and breeding grounds for fish, birds, and other wildlife; opportunities for recreational and economic development; and protection from floods and severe storms.

With an eye toward these benefits, some states are developing ways to keep existing tidal wetlands healthy and intact, and to restore degraded habitats. A challenging part of this equation is understanding how much carbon is stored—and how much more could be stored—in wetlands. Enter the Smithsonian Environmental Research Center (SERC). Headquartered in Maryland along the Chesapeake Bay—the nation’s largest and the world’s third-largest estuary —the center directs coastal ecosystems research that can lead to policies and practices supporting a more sustainable planet.

“Wetlands are pulling a lot of weight in mitigating climate change, especially given the relatively small amount of land they occupy on the planet,” said Jaxine Wolfe, a research technician with the center. “That means we can leverage these ecosystems to mitigate the effects of climate change. You can make a difference by conserving wetlands or restoring them.”

Read Pew partner U.S. Nature 4 Climate’s Q&A with Jaxine Wolfe, who coordinated the Smithsonian Environmental Research Center’s blue carbon report card.

‘Report card’ showed improvements for most coastal states

Led by Jim Holmquist, the center’s wetland ecologist, Wolfe and fellow data technicians Rose Cheney and Henry Betts developed and updated one of SERC’s defining blue carbon projects: the Coastal Carbon Atlas and Library, a digital compilation of global blue carbon data. From that data, the center in 2021 developed a state-level “blue carbon report card” for the 23 coastal states, a Pew Charitable Trusts-funded initiative that provided composite scores of soil carbon data for each of the states examined, based on four metrics:

  • Data quantity (the number of “cores”—or soil samples—compared with total coastal wetlands in the state).
  • Data quality (how well the cores assess blue carbon).
  • Spatial representation (how well dispersed sampling efforts are across the state’s coastal wetlands).
  • Habitat representation (how well habitats sampled match their estimated area in the state).
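As a rough illustration of how four metric ratings could roll up into one composite grade, here is a minimal Python sketch. The 0-to-1 scale, the equal weighting, the grade cutoffs, and the example values are all assumptions for illustration and are not the report card's actual methodology.

```python
# Hypothetical per-metric scores for one state, each scaled 0 (poor) to 1 (excellent).
metrics = {
    "data_quantity": 0.6,          # cores relative to total coastal wetlands
    "data_quality": 0.8,           # how well the cores assess blue carbon
    "spatial_representation": 0.5, # dispersion of sampling across the state
    "habitat_representation": 0.7, # sampled habitats vs. their estimated area
}

# Equal weighting is an assumption made only for this sketch.
composite = sum(metrics.values()) / len(metrics)

def grade(score: float) -> str:
    if score >= 0.75:
        return "good"
    if score >= 0.5:
        return "fair"
    return "poor"

print(f"Composite score: {composite:.2f} ({grade(composite)})")
```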

The highest-ranking states across composite scores in the inaugural 2021 report card were Louisiana, Massachusetts, Oregon, and Washington. The research also concluded that at least five states lagged in the quantity and quality of their data in SERC’s database: Maine, Maryland, New Jersey, New York, and Virginia.


Recognizing that “collaboration between researchers and networks to increase data access is really important,” Wolfe said, SERC in July 2023 launched a data stewardship effort to help researchers across the country submit coastal carbon data to the atlas, with a primary goal of bolstering data from those underrepresented states.

This led to an updated report card , released in mid-June 2024, that showed that most states were rated at least “fair” across all metrics—an improvement over 2021—and that most improved across several parameters.

Of the 23 coastal states, 21—or 91%—improved the quantity of their data, while 16—or 70%—improved their data’s quality. Among the “most improved” states, Alabama, Maine, and New Hampshire scored the greatest gains in data quantity, while Alabama, Mississippi, and Oregon registered the strongest advances in data quality.

Three factors that contributed to the improvements:

  • Establishing agreements and protocols to share data among state agencies, academic institutions, nongovernmental organizations, and private sector entities involved in blue carbon research and monitoring.
  • Developing and standardizing how data is collected, managed, and reported.
  • Asking the Coastal Carbon Network, a consortium of coastal land managers and researchers aimed at accelerating the pace of coastal wetland science, and other federal and regional initiatives to provide technical assistance, training, and funding that support state- and federal-level blue carbon data needs and priorities.

Better and expanded data leads to greater accuracy and applicability

These efforts resulted in more expansive and accurate data that helps states better understand how much carbon their coastal wetlands store, which in turn enables officials to set measurable conservation and restoration goals for these habitats.

Alongside the updated report card, SERC has connected data in the atlas to carbon accumulation rates, or measurements over time of how much carbon dioxide is captured from the atmosphere and stored as blue carbon. This information can help states and federal agencies understand the extent to which tidal wetlands function as “carbon sinks”—places that capture and store more carbon than is released—as well as how land use activities that destroy wetlands may impact carbon storage, and where restoration of wetlands could deliver climate benefits.

“Coastal carbon stocks and stores are analogous to money in a bank account,” Wolfe said. “Wetlands capture and store blue carbon similar to the way people deposit money into banks. Blue carbon accumulation rates are like the interest on those bank deposits. By protecting and allowing these ecosystems to mature, they do the work of accumulating carbon each year, just like bank account interest, providing more and greater benefits to the environment.”

Taken together, the atlas and the accumulation rate data can help states measure and improve how they manage their tidal wetlands, with an eye toward maintaining and expanding these natural carbon sinks and reducing the overall effects of climate change.

Alex Clayton Moya is an officer with Pew’s U.S. conservation project.



  • Review Article
  • Published: 29 August 2024

Bioinformatics advances in eccDNA identification and analysis

  • Fuyu Li   ORCID: orcid.org/0009-0000-5410-3556 1 ,
  • Wenlong Ming   ORCID: orcid.org/0000-0002-5553-905X 2 ,
  • Wenxiang Lu 1 ,
  • Ying Wang 1 ,
  • Xianjun Dong   ORCID: orcid.org/0000-0002-8052-9320 3 , 4 &
  • Yunfei Bai   ORCID: orcid.org/0000-0002-4088-4117 1  

Oncogene (2024)


  • DNA recombination

Extrachromosomal circular DNAs (eccDNAs) are a unique class of chromosome-originating circular DNA molecules, which are closely linked to oncogene amplification. Due to recent technological advances, particularly in high-throughput sequencing technology, bioinformatics methods based on sequencing data have become primary approaches for eccDNA identification and functional analysis. Currently, eccDNA-relevant databases incorporate previously identified eccDNA and provide thorough functional annotations and predictions, thereby serving as a valuable resource for eccDNA research. In this review, we collected around 20 available eccDNA-associated bioinformatics tools, including identification tools and annotation databases, and summarized their properties and capabilities. We evaluated some of the eccDNA detection methods in simulated data to offer recommendations for future eccDNA detection. We also discussed the current limitations and prospects of bioinformatics methodologies in eccDNA research.
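For readers new to the area, the following minimal Python sketch illustrates the core signal most short-read eccDNA detectors look for: a split read whose two segments map in back-to-front order on the same chromosome and strand, implying a circular junction. The record format, coordinates, and thresholds are hypothetical; the tools reviewed here work from real alignment files (e.g., BAM) and apply far more extensive filtering and junction clustering.

```python
from collections import namedtuple, Counter

# Hypothetical split-read record: the two mapped segments of one read.
Segment = namedtuple("Segment", "chrom start end strand")
SplitRead = namedtuple("SplitRead", "name first second")  # first = left part of the read

def circular_junction(read, max_span=1_000_000):
    """Return a (chrom, start, end) candidate if the read's segments map in
    back-to-front order on the same chromosome and strand, the hallmark of a
    read crossing an eccDNA junction."""
    a, b = read.first, read.second
    if a.chrom != b.chrom or a.strand != b.strand:
        return None
    # For a circle, the later part of the read maps upstream of the earlier part.
    if b.start < a.start and (a.end - b.start) <= max_span:
        return (a.chrom, b.start, a.end)
    return None

reads = [
    SplitRead("r1", Segment("chr7", 55_200_000, 55_200_075, "+"),
                    Segment("chr7", 55_100_000, 55_100_060, "+")),
    SplitRead("r2", Segment("chr7", 55_199_990, 55_200_075, "+"),
                    Segment("chr7", 55_100_000, 55_100_055, "+")),
    SplitRead("r3", Segment("chr1", 10_000, 10_075, "+"),
                    Segment("chr1", 20_000, 20_060, "+")),  # ordinary linear split read
]

# Count split-read support per candidate junction.
candidates = Counter(j for j in (circular_junction(r) for r in reads) if j)
for (chrom, start, end), support in candidates.items():
    print(f"candidate eccDNA {chrom}:{start}-{end}, split-read support={support}")
```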



Data availability.

The simulation pipeline and all scripts for eccDNA detection are available on GitHub ( https://github.com/FuyuLi/Review_bioinformatics_in_eccDNA ). The simulated datasets generated for evaluation are available from the corresponding authors and https://pan.seu.edu.cn:443/link/0300E3FCD85AA52A4301691B68BA318C .


Acknowledgements

This work was supported by the National Natural Science Foundation of China (grant number: 61871121).

Author information

Authors and Affiliations

State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, PR China

Fuyu Li, Wenxiang Lu, Ying Wang & Yunfei Bai

Institute for AI in Medicine, School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing, 210044, PR China

Wenlong Ming

Adams Center of Parkinson’s Disease Research, Yale School of Medicine, Yale University, 100 College St, New Haven, CT, 06511, USA

Xianjun Dong

Department of Neurology, Yale School of Medicine, Yale University, 100 College St, New Haven, CT, 06511, USA


Contributions

YB provided direction and guidance throughout the preparation of this manuscript. FL collected and interpreted the studies, and wrote the manuscript with the assistance of WM and WL. Both WM and YW made significant contributions to the final manuscript. Both XD and YB reviewed and made significant revisions to the manuscript. All authors read and approved the submission and publication.

Corresponding authors

Correspondence to Wenlong Ming, Xianjun Dong or Yunfei Bai.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Materials

Supplementary Table 1

Supplementary Table 2

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article.

Li, F., Ming, W., Lu, W. et al. Bioinformatics advances in eccDNA identification and analysis. Oncogene (2024). https://doi.org/10.1038/s41388-024-03138-6

Download citation

Received: 20 March 2024

Revised: 09 August 2024

Accepted: 16 August 2024

Published: 29 August 2024

DOI: https://doi.org/10.1038/s41388-024-03138-6


  • Open access
  • Published: 28 August 2024

Transforming simulation in healthcare to enhance interprofessional collaboration leveraging big data analytics and artificial intelligence

  • Salman Yousuf Guraya 1  

BMC Medical Education volume 24, Article number: 941 (2024)


Simulation in healthcare, empowered by big data analytics and artificial intelligence (AI), has the potential to drive transformative innovations towards enhanced interprofessional collaboration (IPC). This convergence of technologies revolutionizes medical education, offering healthcare professionals (HCPs) an immersive, iterative, and dynamic simulation platform for hands-on learning and deliberate practice. Big data analytics, integrated into modern simulators, creates realistic clinical scenarios that mimic real-world complexities. Optimizing skill acquisition and decision-making with personalized feedback supports life-long learning. Beyond clinical training, simulation-based AI, virtual reality (VR), and augmented reality (AR) tools offer avenues for quality improvement, research and innovation, and teamwork. The integration of VR and AR also enhances the simulation experience by providing realistic environments for practicing high-risk procedures and personalized learning. IPC, crucial for patient safety and quality care, finds a natural home in simulation-based education, fostering teamwork, communication, and shared decision-making among diverse HCP teams. A thoughtful integration of simulation-based medical education into curricula requires overcoming barriers such as professional silos and stereotyping. Technology should be implemented in clinical training cautiously, without sidelining real patient-based medical education.


Simulation in healthcare, powered by big data analytics (BDA) and artificial intelligence (AI), stands at the forefront of transformative innovations that promise to facilitate interprofessional collaboration (IPC). This convergence of technologies and educational philosophies not only revolutionizes medical training but also enhances the quality of care and patient safety in an IPC climate for efficient delivery of healthcare [ 1 ]. Simulation in healthcare offers a controlled, versatile, and safe environment for healthcare professionals (HCPs) from diverse disciplines to engage in hands-on learning with deliberate practice [ 2 ]. Learners are immersed in an iterative, interactive climate that nurtures opportunities to acquire transferable psychomotor and cognition-based skills [ 3 ]. A simulated environment captures the essence of life-long learning, where learners can train through deliberate practice until they have acquired their skills.

BDA, embedded in modern cutting-edge simulators, can utilize enormous healthcare data for clinical training and skills acquisition [ 4 ]. For instance, Bateman and Wood employed Amazon Web Services to assemble a complete human genomic scaffold comprising 140 million individual base pairs by adopting an advanced hashing algorithm [ 5 ]. Later, a BDA platform successfully matched the data of hospitalized children to their whole-genome sequencing for the management of potentially incurable clinical conditions [ 6 ]. From another perspective, by leveraging clinical scenarios with realism, BDA can be a valuable tool for reflecting the complexities of real-world medical practice. This data-driven approach mimics the variability and inconsistency encountered in real clinical settings, preparing HCPs for diverse patient encounters and crisis management. AI, with its machine learning algorithms (MLAs) and natural language processing (NLP), further fortifies the impact of simulation by enabling adaptive learning experiences [ 7 ]. Moreover, AI-powered patient simulators with automated interfaces can demonstrate high-fidelity, realistic physiological responses such as pulse, blood pressure, breathing patterns, and facial expressions, allowing learners to practice decision-making in lifelike scenarios. By analyzing simulation data, institutions can identify trends, best practices, and areas for improvement, ultimately enhancing patient outcomes and advancing medical knowledge.

Applications of BDA harness electronic health records, medical imaging, genetic information, and patient demographics. By aggregating and analyzing these data, simulation platforms can create realistic scenarios that learners can use for clinical reasoning and critical decision-making. Additionally, MLAs and NLP can predict disease prognosis, treatment efficacy, and unwanted outcomes, thereby offering a reliable hub for interactive and immersive learning for HCPs [ 8 ]. MLAs and NLP enable adaptive learning experiences by analyzing learner interactions and performance in real time. This opportunity to acquire skills mastery with personalized feedback, whether from the simulator, a peer, or a facilitator, makes simulation a master-class educational and training tool for all HCPs. For instance, if a learner consistently makes errors in decision-making or a procedural skill, a smart simulator can tailor further exercises to provide targeted practice opportunities for that learner.
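As a toy illustration of the adaptive logic just described, the following Python sketch picks a learner's next exercise from per-skill error rates. The skill names, threshold, and exercise bank are hypothetical; a real platform would learn such policies from data rather than hard-code them.

```python
# Hypothetical per-skill performance log for one learner: (attempts, errors).
performance = {
    "airway_management": (12, 1),
    "drug_dosing": (10, 4),
    "sterile_technique": (8, 1),
}

# Hypothetical bank of remedial exercises, one per skill.
exercise_bank = {
    "airway_management": "advanced difficult-airway scenario",
    "drug_dosing": "weight-based dosing drill with feedback",
    "sterile_technique": "central-line insertion checklist run",
}

def next_exercise(performance, error_threshold=0.25):
    """Pick the skill with the highest error rate above the threshold;
    otherwise advance the learner to a harder combined scenario."""
    rates = {skill: errors / attempts
             for skill, (attempts, errors) in performance.items() if attempts > 0}
    weakest, rate = max(rates.items(), key=lambda kv: kv[1])
    if rate >= error_threshold:
        return f"Targeted practice: {exercise_bank[weakest]} (error rate {rate:.0%})"
    return "All skills below error threshold: schedule a harder combined scenario"

print(next_exercise(performance))
```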

Clinical training stands at the crossroads of adopting AI, virtual reality (VR), and augmented reality (AR) technologies. Beyond training, simulation-driven medical education holds immense potential for quality improvement and research in healthcare [ 9 ]. VR and AR technologies offer immersive experiences that simulate clinical settings with unprecedented realism. VR headsets transport learners into a cyber space where they work with animations, digital images, and a host of other exercises in a virtual climate [ 10 ]. AR overlays digital information onto the physical world, allowing learners to visualize anatomical structures, medical procedures, or patient data in real time. Moreover, VR and AR can be used to practice high-risk medical procedures until complete skill mastery is acquired, an opportunity that is not available in real-world workplaces because of threats to patient safety and limited time for learners' training [ 11 ]. At the same time, mapping learners' needs onto the curriculum is possible only in a simulated environment, where learners' expectations can be tailored to their learning styles [ 11 ]. AI, VR, and AR technologies in healthcare simulators empower learners to develop clinical expertise, enhance patient care, and drive innovations in healthcare delivery.

An example of the integration of AI, NLP, MLAs, and other algorithms in simulation is the management of sepsis in a virtual patient by a team of HCPs from different healthcare disciplines. The patient presents with fever, confusion, and rapid breathing in the emergency room. The AI platform creates a detailed medical record for the patient with past hospital visits, medications, allergies, and baseline health metrics. The AI simulates the patient's symptoms in real time with tachycardia, tachypnea, hypotension, and fever. The trainees interview the virtual patient, and the AI responds, using NLP, with coherent and contextually appropriate answers. The trainees order a set of tests, including blood cultures, a complete blood count, and lactate levels. The AI presents realistic test results in which blood cultures show a bacterial infection, leukocytosis, and elevated lactate levels. Based on the diagnosis of sepsis, the trainees plan treatment, which typically includes oxygen, broad-spectrum antibiotics, and intravenous fluids. The AI then adjusts the patient's condition based on the trainees' actions, which may lead to improvement in clinical parameters. However, delayed treatment could lead to worsening symptoms such as septic shock. Furthermore, the AI can introduce complications if initial treatments were ineffective or if the trainees commit errors. Thereupon, the AI provides real-time feedback on the trainees' decisions, which can highlight missed signs, suggest alternative diagnostic tests, or recommend adjustments to treatment plans. Lastly, the AI generates a summary report of performance with a breakdown of diagnostic accuracy, treatment efficacy, and adherence to clinical guidelines. MLAs analyze patterns in patient data to assist in diagnosis. In this context, decision trees and neural networks analyze vast datasets of patient records to create realistic virtual patients with diverse medical histories and clinical conditions.
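A minimal Python sketch of the sepsis walkthrough above, modeled as a simple rule-based virtual patient, follows. The vital-sign values, the treatment bundle, and the update rules are invented for illustration; a production simulator would drive these transitions with trained models and far richer physiology.

```python
from dataclasses import dataclass, field

@dataclass
class VirtualPatient:
    # Simplified sepsis vitals; starting values reflect the scenario above.
    heart_rate: int = 122        # tachycardia
    resp_rate: int = 28          # tachypnea
    systolic_bp: int = 86        # hypotension
    temperature: float = 39.2    # fever
    log: list = field(default_factory=list)

    def step(self, actions: set) -> None:
        """Advance one time step: improve vitals if the key sepsis-bundle
        actions were taken, otherwise deteriorate toward septic shock."""
        bundle = {"oxygen", "iv_fluids", "broad_spectrum_antibiotics"}
        if bundle <= actions:
            self.heart_rate -= 8
            self.resp_rate -= 3
            self.systolic_bp += 6
            self.temperature -= 0.3
            self.log.append("Improving: full bundle delivered")
        else:
            missing = bundle - actions
            self.systolic_bp -= 5
            self.heart_rate += 4
            self.log.append(f"Deteriorating: missing {sorted(missing)}")

    def report(self) -> str:
        return (f"HR {self.heart_rate}, RR {self.resp_rate}, "
                f"SBP {self.systolic_bp}, T {self.temperature:.1f} | "
                + "; ".join(self.log))

patient = VirtualPatient()
patient.step({"oxygen"})                                             # incomplete bundle
patient.step({"oxygen", "iv_fluids", "broad_spectrum_antibiotics"})  # full bundle
print(patient.report())
```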

There has been a proliferation of empirical research on the powerful role of IPC in medical education [ 12 , 13 ]. IPC fosters shared decision-making, role identification and negotiation, and team coherence, and mitigates potential errors [ 14 ]. Through simulated scenarios, HCPs learn to navigate interdisciplinary challenges, appreciate each other’s roles, and develop a shared approach to patient care. At the same time, simulation in healthcare faces challenges of cost, access, development, and ethical considerations. Nevertheless, the integration of simulation, BDA, VR, AR, and AI heralds a new era of IPC in healthcare, where learning, practice, and innovation converge to shape the future of medicine.

The overarching goal of all healthcare systems is patient safety, as reiterated by the World Health Organization (WHO) sustainable development goals [ 15 ]. The General Medical Council, Irish Medical Council, CanMEDS, Accreditation Council for Graduate Medical Education, and EmiratesMEDS frameworks are in agreement with the WHO, and in this context IPC can potentially enhance the quality of care and patient safety [ 16 ]. Though the role of IPC is widely accepted, there is a lukewarm response from medical institutions about its integration into existing curricula. Professional silos, stereotyping, bureaucratic inertia, and resistant mindsets are some of the deterring factors [ 17 ]. In the era of simulation in healthcare, IPC can be efficiently embedded into this technology-powered educational tool for impactful collaborative teamwork. By harnessing the technological power of VR, AR, and AI, simulation platforms can leverage the inherent advantages of IPC in clinical training. Once skills acquisition is accomplished on the simulated platform, its recreation in the real world becomes a seamless transfer of transferable skills.

To sum up, despite exponential growth in the use of technology-driven simulation in healthcare, educators should be mindful of its careful integration into medical curricula. Clinical training on real patients cannot be replaced by any strategy or tool, regardless of its perceived efficiency or effectiveness. Bearing in mind that many learners favor fluid reasoning over crystallized verbal comprehension, technology-driven simulation plays a vital role in medical education. A thoughtful integration of simulation, pitched at specific courses and modules and spiraled across the curriculum, will enhance the learning experience of medical and health sciences students and HCPs [ 18 ].

Data availability

No datasets were generated or analysed during the current study.

Choudhury A, Asan O. Role of artificial intelligence in patient safety outcomes: systematic literature review. JMIR Med Inf. 2020;8(7):e18599.


Higgins M, Madan CR, Patel R. Deliberate practice in simulation-based surgical skills training: a scoping review. J Surg Educ. 2021;78(4):1328–39.

Watts PI, McDermott DS, Alinier G, Charnetski M, Ludlow J, Horsley E, et al. Healthcare simulation standards of best practiceTM simulation design. Clin Simul Nurs. 2021;58:14–21.

Chrimes D, Moa B, Zamani H, Kuo M-H, editors. Interactive healthcare big data analytics platform under simulated performance. 2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech); 2016: IEEE.

Bateman A, Wood M. Cloud computing. Oxford University Press; 2009. p. 1475.

Twist GP, Gaedigk A, Miller NA, Farrow EG, Willig LK, Dinwiddie DL, et al. Constellation: a tool for rapid, automated phenotype assignment of a highly polymorphic pharmacogene, CYP2D6, from whole-genome sequences. NPJ Genomic Med. 2016;1(1):1–10.

Winkler-Schwartz A, Bissonnette V, Mirchi N, Ponnudurai N, Yilmaz R, Ledwos N, et al. Artificial intelligence in medical education: best practices using machine learning to assess surgical expertise in virtual reality simulation. J Surg Educ. 2019;76(6):1681–90.

Li WT, Ma J, Shende N, Castaneda G, Chakladar J, Tsai JC, et al. Using machine learning of clinical data to diagnose COVID-19: a systematic review and meta-analysis. BMC Med Inf Decis Mak. 2020;20:1–13.


Caffò AO, Tinella L, Lopez A, Spano G, Massaro Y, Lisi A, et al. The drives for driving simulation: a scientometric analysis and a selective review of reviews on simulated driving research. Front Psychol. 2020;11:917.

Hsieh M-C, Lee J-J. Preliminary study of VR and AR applications in medical and healthcare education. J Nurs Health Stud. 2018;3(1):1.

Forgione A, Guraya SY. The cutting-edge training modalities and educational platforms for accredited surgical training: a systematic review. J Res Med Sci. 2017;22(1):51.

Sulaiman N, Rishmawy Y, Hussein A, Saber-Ayad M, Alzubaidi H, Al Kawas S, et al. A mixed methods approach to determine the climate of interprofessional education among medical and health sciences students. BMC Med Educ. 2021;21:1–13.

Guraya SY, David LR, Hashir S, Mousa NA, Al Bayatti SW, Hasswan A, et al. The impact of an online intervention on the medical, dental and health sciences students about interprofessional education; a quasi-experimental study. BMC Med Educ. 2021;21:1–11.

Wei H, Corbett RW, Ray J, Wei TL. A culture of caring: the essence of healthcare interprofessional collaboration. J Interprof Care. 2020;34(3):324–31.

World Health Organization. Global patient safety action plan 2021–2030: towards eliminating avoidable harm in health care. World Health Organization; 2021.

Guraya SS, Umair Akhtar M, Sulaiman N, David LR, Jirjees FJ, Awad M, et al. Embedding patient safety in a scaffold of interprofessional education; a qualitative study with thematic analysis. BMC Med Educ. 2023;23(1):968.

Supper I, Catala O, Lustman M, Chemla C, Bourgueil Y, Letrilliart L. Interprofessional collaboration in primary health care: a review of facilitators and barriers perceived by involved actors. J Public Health. 2015;37(4):716–27.

Guraya SS, Guraya SY, Al-Qahtani MF. Developing a framework of simulation-based medical education curriculum for effective learning. Med Educ. 2020;24(4):323–31.


Acknowledgements

Not applicable.

Author information

Authors and Affiliations

Vice Dean College of Medicine, University of Sharjah, Sharjah, United Arab Emirates

Salman Yousuf Guraya


Contributions

This is a sole-author manuscript. Salman Guraya conceived, prepared, reviewed, revised, and finalized this editorial article.

Corresponding author

Correspondence to Salman Yousuf Guraya.

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Not applicable as this is an editorial article.

Competing interests

The corresponding author is a senior editorial board member of BMC Medical Education.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

Reprints and permissions

About this article

Cite this article.

Guraya, S.Y. Transforming simulation in healthcare to enhance interprofessional collaboration leveraging big data analytics and artificial intelligence. BMC Med Educ 24, 941 (2024). https://doi.org/10.1186/s12909-024-05916-y

Download citation

Received: 06 May 2024

Accepted: 14 August 2024

Published: 28 August 2024

DOI: https://doi.org/10.1186/s12909-024-05916-y


  • Big data analytics
  • Augmented reality
  • Virtual reality
  • Artificial intelligence
  • Medical education


  • Open access
  • Published: 31 August 2024

Incidence of post-extubation dysphagia among critical care patients undergoing orotracheal intubation: a systematic review and meta-analysis

  • Weixia Yu 1   na1 ,
  • Limi Dan 1   na1 ,
  • Jianzheng Cai 1 ,
  • Yuyu Wang 1 ,
  • Qingling Wang 1 ,
  • Yingying Zhang 1 &
  • Xin Wang 1  

European Journal of Medical Research volume 29, Article number: 444 (2024)


Post-extubation dysphagia (PED) emerges as a frequent complication following endotracheal intubation within the intensive care unit (ICU). PED has been strongly linked to adverse outcomes, including aspiration, pneumonia, malnutrition, heightened mortality rates, and prolonged hospitalization, resulting in escalated healthcare expenditures. Nevertheless, the reported incidence of PED varies substantially across the existing body of literature. Therefore, the principal objective of this review was to provide a comprehensive estimate of PED incidence in ICU patients undergoing orotracheal intubation.

We searched Embase, PubMed, Web of Science, Cochrane Library, China National Knowledge Infrastructure (CNKI), Wanfang Database, China Science and Technology Journal Database (VIP), and SinoMed from inception to August 2023. Two reviewers independently screened studies and extracted data. Subsequently, a random-effects model was employed for the meta-analysis, utilizing the “metaprop” command in Stata SE version 15.0, to ascertain the incidence of PED. In addition, we performed subgroup analyses and meta-regression to elucidate potential sources of heterogeneity among the included studies.
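For readers without Stata, the following Python sketch shows the underlying idea of a random-effects pooled proportion: logit-transform each study's proportion, estimate between-study variance with the DerSimonian-Laird method, and back-transform the pooled estimate. The study counts are invented for illustration; the review's actual analysis used the metaprop command in Stata.

```python
import math

# Hypothetical per-study data: (PED events, sample size); not the review's actual studies.
studies = [(30, 80), (45, 150), (20, 90), (60, 120), (25, 100)]

def logit(p): return math.log(p / (1 - p))
def inv_logit(x): return 1 / (1 + math.exp(-x))

# Logit-transformed proportions and their approximate within-study variances.
y = [logit(e / n) for e, n in studies]
v = [1 / e + 1 / (n - e) for e, n in studies]

# Fixed-effect step, needed to compute Cochran's Q.
w = [1 / vi for vi in v]
y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
Q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

# DerSimonian-Laird estimate of between-study variance (tau^2).
k = len(studies)
c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
tau2 = max(0.0, (Q - (k - 1)) / c)

# Random-effects pooling and 95% CI, back-transformed to a proportion.
w_star = [1 / (vi + tau2) for vi in v]
pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
se = math.sqrt(1 / sum(w_star))
low, high = inv_logit(pooled - 1.96 * se), inv_logit(pooled + 1.96 * se)

print(f"Pooled incidence: {inv_logit(pooled):.1%} (95% CI {low:.1%}-{high:.1%})")
```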

Of 4144 studies, 30 were included in this review. The overall pooled incidence of PED was 36% (95% confidence interval [CI] 29–44%). Subgroup analyses revealed that the pooled incidence of PED, stratified by assessment time (≤ 3 h, 4–6 h, ≤ 24 h, and ≤ 48 h), was 31.0% (95% CI 8.0–59.0%), 28% (95% CI 22.0–35.0%), 41% (95% CI 33.0–49.0%), and 49.0% (95% CI 34.0–63.0%), respectively. When the sample size was 100 < N ≤ 300, the PED incidence was closer to the overall PED incidence. Meta-regression analysis highlighted that sample size, assessment time, and mean intubation time were sources of heterogeneity among the included studies.

The incidence of PED was high among ICU patients who underwent orotracheal intubation, and ICU professionals should raise awareness of PED. At the same time, it is important to develop guidelines or consensus on the most appropriate PED assessment time and assessment tools to accurately determine the incidence of PED.


Introduction

Mechanical ventilation is the most common form of technological support in the ICU, required by 20–40% of adult patients [ 1 ]. Orotracheal intubation is the primary route for mechanical ventilation in the ICU, and it can increase the risk of post-extubation dysphagia (PED) [ 2 , 3 ]. PED is any form of swallowing dysfunction that arises after extubation following endotracheal intubation, affecting the passage of food from the mouth to the stomach. The occurrence of PED within the ICU setting varies considerably among countries [ 4 ]: 13.3–61.8% in the United States [ 5 , 6 ], 25.3–43.5% in France, and 23.2–56% in China [ 7 , 8 ], with reported incidence ranging overall from 7 to 80% [ 9 , 10 ]. PED stands out as a prominent complication in this context. For instance, See et al. elucidated that patients with PED face an 11-fold higher risk of aspiration compared with those without PED [ 11 ]. McIntyre et al. underscored that patients with PED endure double the length of stay in the ICU and the overall hospitalization period compared with patients without PED [ 10 ]. Furthermore, PED has emerged as an independent predictor of 28-day and 90-day mortality [ 12 ]. This high incidence of PED places an immense burden not only on patients but also on the broader healthcare system. Therefore, a systematic review and meta-analysis is necessary to explore the incidence of PED in ICU patients. A systematic review and meta-analysis conducted by McIntyre et al. reported that the incidence of PED was 41%, but the main outcome of some of their included studies was aspiration [ 12 ]. Although aspiration and PED are closely related, not all aspiration is caused by dysphagia; the incidence of aspiration in the ICU is 8.80–88.00% [ 13 , 14 ], so the incidence of PED in that study may be overestimated. Moreover, the literature on PED in ICU patients has been growing, and a new systematic review and meta-analysis is needed to obtain a more precise estimate of its incidence.

The incidence of PED may vary with several covariates, including assessment time, mean intubation time, age, and other relevant factors. First, there is no standard time for swallowing function assessment, which spans a range of intervals, including 3 h [6, 9, 12], 4–6 h [15, 16], 24 h [17, 18, 19], 48 h [20], 7 days [21], and discharge [22], with corresponding PED incidences of 80% [9], 22.62% [15], 56.06% [18], 35.91% [20], 22.06% [21], and 28.78% [22], respectively. Second, PED is closely tied to the duration of orotracheal intubation. Skoretz et al. demonstrated that the overall incidence of PED in the ICU ranges from 3 to 4%, but on re-analysis of patients intubated for more than 48 h, the incidence can surge as high as 51% [23]. Third, the choice of assessment tool used to evaluate PED in ICU patients plays a pivotal role. These tools include the Video-fluoroscopic Swallowing Study (VFSS), Fiberoptic Endoscopic Evaluation of Swallowing (FEES), Standardized Swallowing Assessment (SSA), Bedside Swallowing Evaluation (BSE), Gugging Swallowing Screen (GUSS), Post-Extubation Dysphagia Screening Tool (PEDS), Water Swallowing Test (WST), and others. FEES and VFSS are considered the gold standards, with a detection rate of approximately 80% [9], whereas SSA and BSE exhibit detection rates of 22% and 62%, respectively [5, 15]. Finally, age-related changes in laryngeal sensory and motor function also influence PED risk [24]. Notably, there may be no significant difference in the incidence of PED between elderly and young patients within the initial 48 h post-extubation, but elderly patients recover from PED significantly more slowly than their younger counterparts (5.0 days vs 3.0 days; p = 0.006) [5]. Therefore, it is necessary to explore these covariates as potential sources of heterogeneity in the incidence of PED in ICU patients.

The purpose of this study was to estimate the incidence of PED among ICU patients who underwent orotracheal intubation and investigate potential sources of heterogeneity through the application of subgroup analyses and meta-regression.

This systematic review and meta-analysis was conducted adhering to the guidelines outlined in the Joanna Briggs Institute (JBI) Reviewers’ Manual and followed the principles of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 statement (PRISMA 2020) [ 25 ] (see Additional file 1: Table S1). In addition, it was registered with PROSPERO under the registration number CRD42022373300.

Eligibility criteria

The study's eligibility criteria were established according to the PICOS principle. Inclusion criteria were as follows: Population (P): adult patients (≥ 18 years old) admitted to the ICU. Exposure (E): orotracheal intubation. Outcome (O): PED. Study design (S): observational study (cohort, case–control, or cross-sectional). Where multiple articles were derived from the same sample, only the article providing the most detailed data was included. Patients at high risk of dysphagia (such as those with head and neck cancer, previous head and neck surgery, palliative care, esophageal dysfunction, stroke, esophageal cancer, or Parkinson's disease) were excluded. Studies were excluded if their original data were incomplete or could not be extracted. Studies were also excluded if their sample sizes fell below 30 participants or the full text was inaccessible.

Data sources and search strategy

We comprehensively searched multiple databases, including Embase, PubMed, Web of Science, Cochrane Library, China National Knowledge Infrastructure (CNKI), Wanfang, China Science and Technology Journal Database (VIP), and SinoMed, from inception to August 18, 2023. Searches were limited to studies published in Chinese or English. Because the initial search, which included the qualifier "ICU", retrieved a limited number of studies, we broadened the scope of the literature search by reducing the emphasis on "ICU". After a series of preliminary searches, we finalized a strategy that combined subject headings and free-text terms and employed Boolean operators to enhance precision. In addition, a manual search of the reference lists of selected articles was carried out to identify any supplementary studies not identified through the electronic search. For the complete search strategies across all databases, please refer to Additional file 1: Table S2.

Quality evaluation

The risk of bias in the included studies was evaluated by two trained investigators. Cross-sectional studies were assessed with the Agency for Healthcare Research and Quality (AHRQ) tool [26], which consists of 11 items for a maximum score of 11; scores of 0–3, 4–7, and 8–11 corresponded to poor, moderate, and high quality, respectively. Cohort studies were assessed with the Newcastle–Ottawa Scale (NOS) [27], which comprises three dimensions and eight items and allows a rating of up to 9 stars; ratings of 0–4, 5–6, and 7–9 stars indicated poor, moderate, and high quality, respectively. Any discrepancies between the investigators were resolved through discussion and, when necessary, consultation with a third expert specializing in evidence-based practice methodology.
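For illustration only, the scoring bands above can be written as a small helper; the function names below are hypothetical and not part of the review's published materials.

```python
# Hypothetical helpers mapping the AHRQ and NOS scores described above onto
# the quality categories used in this review.
def ahrq_quality(score: int) -> str:
    """AHRQ checklist for cross-sectional studies (0-11 items met)."""
    if not 0 <= score <= 11:
        raise ValueError("AHRQ score must be between 0 and 11")
    return "poor" if score <= 3 else "moderate" if score <= 7 else "high"

def nos_quality(stars: int) -> str:
    """Newcastle-Ottawa Scale for cohort studies (up to 9 stars)."""
    if not 0 <= stars <= 9:
        raise ValueError("NOS rating must be between 0 and 9 stars")
    return "poor" if stars <= 4 else "moderate" if stars <= 6 else "high"

print(ahrq_quality(8), nos_quality(6))  # -> high moderate
```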

Study selection and data extraction

Bibliographic records were systematically exported into the NoteExpress database to facilitate the screening process and the removal of duplicate citations. Initial screening, based on titles and abstracts, was conducted by two reviewers who possessed specialized training in evidence-based knowledge. To ascertain whether the studies satisfied the predefined inclusion and exclusion criteria, the full texts of potentially relevant articles were acquired. In the event of disagreements between the two reviewers, resolution was achieved through discussion or, when necessary, by enlisting the input of a third reviewer for arbitration.

After the included studies were confirmed, two authors independently extracted data from each paper, including first author, year of publication, country, study design, ICU type, mean patient age, mean intubation time, assessment time, assessment tool, evaluator, sample size, and the number of PED events. Any disparities during data extraction were resolved through discussion and consensus among the reviewers.

The outcomes of this review were: (1) the incidence of PED in patients with orotracheal intubation in the ICU; and (2) the sources of heterogeneity in that incidence.

Statistical analyses

Meta-analysis was conducted using the "metaprop" routine in Stata/SE (version 15.0, StataCorp, TX, USA). Incidence estimates were transformed using the Freeman–Tukey double arcsine transformation to approximate a normal distribution. Heterogeneity was assessed using the I² statistic; pooled analyses of PED used a random-effects model in the presence of significant heterogeneity (I² ≥ 50%) and a fixed-effects model otherwise. A significance level of p < 0.05 was used for all analyses.
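To make the pooling procedure concrete, the sketch below implements the same general approach in Python rather than Stata: each study's proportion is Freeman–Tukey transformed, pooled with DerSimonian–Laird random-effects weights, and summarized with the I² statistic. The event counts, sample sizes, and the simple sin²(t/2) back-transformation are illustrative assumptions, not the review's data or the exact metaprop implementation.

```python
# Minimal Python sketch of the pooling described above (illustrative only;
# not the authors' Stata workflow and not an exact metaprop reimplementation).
import numpy as np

def freeman_tukey(events, n):
    """Freeman-Tukey double arcsine transform and its approximate variance."""
    events, n = np.asarray(events, float), np.asarray(n, float)
    t = np.arcsin(np.sqrt(events / (n + 1))) + np.arcsin(np.sqrt((events + 1) / (n + 1)))
    return t, 1.0 / (n + 0.5)

def pool_random_effects(t, var):
    """DerSimonian-Laird random-effects pooling on the transformed scale."""
    w = 1.0 / var                                    # fixed-effect weights
    t_fixed = np.sum(w * t) / np.sum(w)
    q = np.sum(w * (t - t_fixed) ** 2)               # Cochran's Q
    df = len(t) - 1
    tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_re = 1.0 / (var + tau2)                        # random-effects weights
    t_pooled = np.sum(w_re * t) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return t_pooled, se, i2

# Hypothetical study data: PED events and sample sizes
events = np.array([20, 55, 130, 40])
n = np.array([60, 150, 400, 90])

t, var = freeman_tukey(events, n)
t_pooled, se, i2 = pool_random_effects(t, var)
back = lambda x: np.sin(x / 2) ** 2                  # simple back-transformation to a proportion
lo, hi = t_pooled - 1.96 * se, t_pooled + 1.96 * se
print(f"Pooled incidence ~ {back(t_pooled):.1%} (95% CI {back(lo):.1%}-{back(hi):.1%}), I2 = {i2:.1f}%")
```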

Subgroup analyses investigated the potential impact on the pooled estimate of assessment tool (gold standard, SSA, GUSS, BSE, PEDS, WST, and other tools), year of publication (2000–2010, 2011–2015, 2016–2020, 2021–2023), study design (cross-sectional and cohort), study quality (moderate and high), assessment time (≤ 3 h, 4–6 h, ≤ 24 h, ≤ 48 h, and after 48 h post-extubation), mean intubation time (≤ 24 h, 48–168 h, and > 168 h), mean patient age (≤ 44 years, 45–59 years, 60–74 years), evaluator (nurses, speech-language pathologists), ICU type (trauma ICU, cardiac surgery ICU, mixed medical and surgical ICU), and sample size (N ≤ 100, 100 < N ≤ 200, 200 < N ≤ 300, N > 300). Where no source of heterogeneity was identified in the subgroup analyses, we conducted meta-regression to further pinpoint its origins, focusing on assessment time, mean intubation time, mean age, assessment tool, sample size, evaluator, ICU type, study design, study quality, and year of publication. A leave-one-out sensitivity analysis was used to evaluate the stability of the pooled incidence of PED under the random-effects model. Publication bias was assessed with a funnel plot and the trim-and-fill method.
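The leave-one-out sensitivity analysis can be sketched in the same spirit; the snippet below reuses the freeman_tukey and pool_random_effects helpers (and the hypothetical events/n arrays) from the previous example and simply re-pools the incidence with each study omitted in turn.

```python
# Illustrative leave-one-out sensitivity analysis; depends on the helpers and
# hypothetical data defined in the previous snippet.
import numpy as np

def leave_one_out(events, n):
    """Re-pool the incidence once per study, each time omitting that study."""
    pooled = []
    for i in range(len(events)):
        keep = np.arange(len(events)) != i
        t, var = freeman_tukey(events[keep], n[keep])
        t_pooled, _, _ = pool_random_effects(t, var)
        pooled.append(np.sin(t_pooled / 2) ** 2)     # back-transform to a proportion
    return np.array(pooled)

loo = leave_one_out(events, n)
print(f"Leave-one-out pooled incidence ranges from {loo.min():.1%} to {loo.max():.1%}")
```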

Certainty of the evidence

The level of evidence was assessed using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach [28], which classifies the certainty of evidence into four levels: very low, low, moderate, and high. "High quality" indicates that the true effect is likely to be close to the estimate of the effect, whereas "very low quality" indicates very little confidence in the effect estimate, which may differ substantially from the true effect. Two reviewers judged risk of bias, inconsistency, imprecision, indirectness, and publication bias. Disagreements were resolved by consensus with a third reviewer.

Study selection

Of the 4144 studies initially identified, 1280 duplicates were removed, and a further 2864 studies deemed irrelevant were excluded based on title and abstract screening. The full texts of the remaining 122 studies were then examined, and a manual search of the reference lists of selected articles identified 5 additional studies. Finally, 30 studies met the predetermined inclusion criteria for this systematic review and meta-analysis. The study selection flowchart is shown in Fig. 1.

Fig. 1 Flowchart of study selection

General characteristics of the included studies

The characteristics of the included studies are shown in Table 1. The total sample size across these studies was 6228 participants. The earliest study was conducted in 2003 [29] and the most recent in 2023 [15], with 14 studies published after 2020. The largest study, by Schefold et al. [12], comprised 933 participants, and the smallest, by Yılmaz et al. [19], included 40 participants. The methods used to assess the incidence of PED varied among the studies: one study employed VFSS [30], four used FEES [9, 29, 31, 32], seven used SSA [7, 15, 16, 33, 34, 35, 36], six used BSE [5, 10, 17, 37, 38, 39], two used WST [12, 40], two used PEDS [8, 18], two used GUSS [19, 41], and six employed other assessment tools [6, 20, 21, 22, 42, 43] such as ASHA, FOIS, SSQ200, NPS-PED, MASA, and YSP.

Twenty-three studies recorded the assessment time for PED. Specifically, three studies assessed PED within ≤ 3 h post-extubation [6, 9, 12], four at 4–6 h [15, 16, 33, 36], nine within ≤ 24 h [7, 8, 17, 18, 19, 31, 34, 40, 41], three within ≤ 48 h [5, 20, 37], and four at > 24 h post-extubation [21, 22, 29, 38]. In terms of study quality, eight of the included studies were categorized as high quality and the remainder as moderate quality (see Additional file 1: Tables S3, S4).

Meta-analysis results

Using the random-effects model, the pooled incidence of PED was estimated at 36% (95% CI 29.0–44.0%, I² = 97.06%, p < 0.001; Fig. 2), indicating a substantial degree of heterogeneity. Additional subgroup analyses did not identify the source of this high heterogeneity. However, meta-regression revealed that sample size (p < 0.001), assessment time (p = 0.027), and mean intubation time (p = 0.045) were significant factors contributing to the heterogeneity.

Fig. 2 Overall pooled incidence of PED in the ICU

Subgroup analysis of incidence

The subgroup analyses yielded the following incidence rates of PED by assessment time post-extubation: within 3 h, 31% (95% CI 8.0–59.0); at 4–6 h, 28% (95% CI 22.0–35.0, I² = 78.56%, p < 0.001); within 24 h, 41% (95% CI 33.0–49.0, I² = 88.99%, p < 0.001); and within 48 h, 49% (95% CI 34.0–63.0). The incidence of PED beyond 24 h post-extubation was 37% (95% CI 23.0–52.0, I² = 91.73%, p < 0.001) (Additional file 1: Fig. S1). When studies were analyzed by sample size (N), the pooled incidence of PED was 51% (95% CI 39.0–63.0, I² = 87.11%, p < 0.001) for N ≤ 100, 37% (95% CI 31.0–43.0, I² = 84.74%, p < 0.001) for 100 < N ≤ 200, 32% (95% CI 20.0–46.0, I² = 97.16%, p < 0.001) for 200 < N ≤ 300, and 16% (95% CI 8.0–26.0, I² = 97.07%, p < 0.001) for N > 300 (see Additional file 1: Fig. S2). Further analyses were conducted by assessment tool, mean intubation time, mean age, ICU type, evaluator, publication year, study design, and study quality (see Additional file 1: Figs. S3–S11).

Results of meta-regression analysis

In the meta-regression analysis, we examined PED assessment time, sample size, assessment tool, mean intubation time, mean age, ICU type, evaluator, publication year, study design, and study quality as potential covariates to identify sources of heterogeneity (Table 2). Univariate meta-regression revealed a statistically significant association between incidence and sample size, assessment time, and mean intubation time. Bubble plots of the meta-regression for each covariate are shown in Additional file 1: Figs. S12–S22.

Sensitivity analysis

Sensitivity analysis showed that the pooled incidence of PED ranged from 29 to 44% when individual studies were omitted (see Additional file 1: Fig. S23). The marginal variance between these results and the pooled incidence was minimal, suggesting that the pooled estimate is stable and reliable.

Publication bias

In our study, publication bias was assessed with a funnel plot (see Additional file 1: Fig. S24). The trim-and-fill-adjusted effect size was similar to the original effect size (p < 0.01) (see Additional file 1: Fig. S25).

According to the GRADE rating [28], the certainty of evidence was very low for all comparisons; thus, the certainty of the evidence regarding the incidence of PED in this review is very low (Table 3).

This systematic review and meta-analysis estimated the incidence of PED in ICU patients and revealed an overall incidence of 36.0% among those who underwent orotracheal intubation. This rate is comparable to the incidence of dysphagia after stroke (36.30%) [45] and aligns with the incidence of PED observed in ICU patients (36%) [46], although it is slightly lower than the 41% reported in the meta-analysis by McIntyre et al. [4]. The incidence of PED among ICU patients who underwent orotracheal intubation was high, and ICU medical professionals, especially nurses, should raise awareness of PED. The diversity and heterogeneity of assessment times and assessment tools across the included studies signal the need for consensus on a range of issues, including the assessment times and tools appropriate for the ICU.

Sample size

This review identified sample size as a significant source of heterogeneity (p < 0.001). Notably, the incidence of PED decreased gradually as the sample size of the studies increased. Larger-scale studies, such as those by McIntyre et al. and Schefold et al., employed simpler assessment tools that allow quick completion [10, 12], although the reliability and validity of some of these tools remain unverified. Conversely, certain studies were conducted by highly trained professionals using the gold standard for PED assessment [9, 29, 31], which, while more accurate, is also time-consuming and costly [47]. In addition, some ICU patients cannot complete the gold standard assessment because their conditions are unstable, resulting in relatively small sample sizes for these studies.

In statistics, sample size is intricately linked to result stability. The confidence intervals for subgroups with N ≤ 100 in this study were wider, which may diminish precision and lead to larger deviations from the true value. As the sample size increased to 100 < N ≤ 300, the confidence intervals narrowed in comparison with the other subgroups; consequently, the PED incidence in these subgroups was closest to the overall PED rate. According to the central limit theorem, if the sampling method remains consistent, results obtained from larger samples are more stable and closer to the true value [48, 49]. It is worth noting that the confidence intervals for the subgroup with N > 300 were wider and diverged more from the total PED incidence. Therefore, future studies should choose their sample size carefully, taking into account the detection rate of the assessment tool used, to ensure both the stability and reliability of the results.

Mean intubation time

This review identified mean intubation time as a significant source of heterogeneity (p = 0.045). Differences in mean intubation time among ICU patients undergoing orotracheal intubation can lead to differing degrees of mucosal damage in the oropharynx and larynx [2, 50], and thus to varying incidences of PED. For instance, Malandraki et al. reported that prolonged intubation is associated with a more than 12-fold increased risk of moderate/severe dysphagia compared with shorter intubation, an effect particularly pronounced among elderly patients [51]. Moreover, studies have demonstrated that ICU patients who develop PED after extended orotracheal intubation also exhibit diminished tongue and lip strength, protracted oral food transport, slower swallowing, and weakness of the swallowing-related muscles [24, 46]. In view of these findings, ICU medical professionals should routinely evaluate the need for orotracheal intubation and strive to minimize the duration of mechanical ventilation.

PED assessment time

This review identified assessment time as a significant source of heterogeneity (p = 0.027). There are currently no established guidelines recommending the optimal timing of the initial PED assessment in ICU patients who have undergone orotracheal intubation. Consequently, assessment time varies widely across studies, resulting in PED incidence rates ranging from 28 to 49% among subgroups. Interestingly, the incidence of PED assessed within ≤ 3 h post-extubation appeared lower than that assessed within ≤ 24 h and ≤ 48 h post-extubation. This difference may be attributable to the study by Schefold et al., which featured a shorter intubation duration [12]; the incidence of PED assessed within ≤ 3 h post-extubation may therefore be underestimated. Moreover, some ICU patients, particularly those with severe illness and extended intubation, may struggle to comply with post-extubation instructions from healthcare personnel; paradoxically, this group is at higher risk of developing PED and, subsequently, post-extubation pneumonia [11]. ICU professionals should evaluate swallowing function post-extubation and identify patients at risk of PED early to reduce complications. If PED is identified, nurses should conduct follow-up assessments at multiple time points to obtain a thorough understanding of the recovery trajectory, which can serve as a foundation for accurately timing clinical interventions.

PED assessment tools

Although the subgroup analyses and meta-regression indicated that PED assessment tools did not contribute to the observed heterogeneity, it is important to acknowledge the wide array of tools employed across the included studies. The findings revealed that GUSS and BSE results were most closely aligned with gold standard screening results, whereas PEDS results tended to be higher than those derived from the gold standard. The other assessment tools generally yielded lower incidence rates of PED, possibly because of differences in specificity or sensitivity. FEES and VFSS are recognized for their meticulous scrutiny of the swallowing process, including the detection of food residue and aspiration, which may not be as comprehensively addressed by other methods [51]. Assessment tools such as BSE, SSA, GUSS, WST, and other clinical methods do not provide direct visualization of the swallowing process; instead, assessors rely on overt clinical symptoms during the patient's initial food or water intake to judge the presence of PED. These methods may overlook occult aspiration, potentially underestimating PED incidence. In contrast, PEDS, which primarily assesses patients based on medical history and clinical symptoms without a drinking or swallowing screen, may overestimate PED incidence. Considering the varying strengths and limitations of existing tools, ICU professionals should select an appropriate PED assessment tool based on the characteristics of the critically ill patient. Early and rapid identification of PED, before resorting to more complex and expensive assessment tools, minimizes the occurrence of complications.

Strengths and weaknesses

In this study, we comprehensively analyzed the incidence of PED in ICU patients who underwent orotracheal intubation across various subgroups, revealing a notable degree of heterogeneity among the included studies. We expanded the search as far as possible and included 30 papers after screening, half of which were published after 2020. Several limitations should be considered when interpreting the results of this meta-analysis. First, the methodological heterogeneity among the studies and the variation in prevalence estimates may call into question the appropriateness of calculating pooled prevalence estimates. To address this, we applied a random-effects model and conducted subgroup analyses and meta-regression, which identified three sources of heterogeneity. Second, the overall quality of evidence for the incidence of PED was rated as very low according to GRADE; higher quality original studies on the incidence of PED are needed. The findings should therefore be interpreted with caution.

In conclusion, our systematic review and meta-analysis revealed a high incidence of PED among ICU patients who underwent orotracheal intubation, and the incidence of PED in the ICU may even be underestimated. These findings should increase awareness of PED among ICU patients. It will be important to develop guidelines or consensus on the most appropriate PED assessment time and assessment tools to accurately estimate the incidence of PED.

Relevance to clinical practice

Each year, an estimated 13 to 20 million critically ill patients require endotracheal intubation to sustain their lives. Patients undergoing orotracheal intubation are at heightened risk of developing PED, which has been linked to prolonged hospital and ICU length of stay, increased rates of pneumonia, and all-cause mortality. Early identification of high-risk patients by clinical nurses is critical for reducing patient burden and adverse outcomes.

Early and repeated assessment: future investigations should assess PED early in clinical practice, especially within 6 h post-extubation. We also suggest follow-up assessments at multiple time points to obtain a thorough understanding of PED incidence and the recovery trajectory among ICU patients who have undergone orotracheal intubation.

Assessment tool: considering the varying strengths and limitations of existing assessment tools, ICU professionals should carefully evaluate the characteristics of critically ill patients and select an appropriate screening tool before resorting to more complex and expensive assessments.

Routinely evaluate the need for orotracheal intubation: healthcare professionals should routinely evaluate the need for orotracheal intubation and strive to minimize the duration of mechanical ventilation.

Availability of data and materials

All data related to the present systematic review and meta-analysis are available from the corresponding author on reasonable request.

Abbreviations

  • CI: Confidence interval
  • ICU: Intensive care unit
  • PED: Post-extubation dysphagia
  • SSQ200: Sydney Swallow Questionnaire 200
  • WST: Water swallowing test
  • PEDS: Post-Extubation Dysphagia Screening Tool
  • BSE: Bedside swallow evaluation
  • YSP: The Yale Swallow Protocol
  • MASA: Mann Assessment of Swallowing Ability
  • ASHA: American Speech-Language-Hearing Association
  • VFSS: Video-fluoroscopic Swallowing Study
  • FEES: Fiberoptic endoscopic evaluation of swallowing
  • GUSS: Gugging Swallowing Screen
  • SSA: Standardized Swallowing Assessment
  • FOIS: Functional Oral Intake Scale
  • NPS-PED: Nurse-performed screening for post-extubation dysphagia
  • SLP: Speech-language pathologist
  • Events of PED
  • PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
  • PROSPERO: International Prospective Register of Systematic Reviews

Wunsch H, Wagner J, Herlim M, Chong DH, Kramer AA, Halpern SD. ICU occupancy and mechanical ventilator use in the United States. Crit Care Med. 2013;41(12):2712–9.

Brodsky MB, Akst LM, Jedlanek E, Pandian V, Blackford B, Price C, Cole G, Mendez-Tellez PA, Hillel AT, Best SR, et al. Laryngeal injury and upper airway symptoms after endotracheal intubation during surgery: a systematic review and meta-analysis. Anesth Analg. 2021;132(4):1023–32.

Brodsky MB, Chilukuri K, De I, Huang M, Needham DM. Coordination of pharyngeal and laryngeal swallowing events during single liquid swallows after oral endotracheal intubation. Am J Respir Crit Care Med. 2017;195:768–77.

McIntyre M, Doeltgen S, Dalton N, Koppa M, Chimunda T. Post-extubation dysphagia incidence in critically ill patients: a systematic review and meta-analysis. Aust Crit Care. 2021;34(1):67–75.

Tsai MH, Ku SC, Wang TG, Hsiao TY, Lee JJ, Chan DC, Huang GH, Chen C. Swallowing dysfunction following endotracheal intubation age matters. Medicine. 2016;95(24):e3871.

Leder SB, Warner HL, Suiter DM, Young NO, Bhattacharya B, Siner JM, Davis KA, Maerz LL, Rosenbaum SH, Marshall PS, et al. Evaluation of swallow function post-extubation: is it necessary to wait 24 hours? Ann Otol Rhinol Laryngol. 2019;128(7):619–24.

Zeng L, Song Y, Dong Y, Wu Q, Zhang L, Yu L, Gao L, Shi Y. Risk score for predicting dysphagia in patients after neurosurgery: a prospective observational trial. Front Neurol. 2021;12:605687.

Dan L, Yunfang C, Chengfen Y, Li T. Reliability and validity of the Chinese version of postextubation dysphagia screening tool for patients with mechanical ventilation. Tianjin J Nurs. 2022;30(2):161–5.

Troll C, Trapl-Grundschober M, Teuschl Y, Cerrito A, Compte MG, Siegemund M. A bedside swallowing screen for the identification of post-extubation dysphagia on the intensive care unit—validation of the Gugging Swallowing Screen (GUSS)—ICU. BMC Anesthesiol. 2023;23(1):122.

McIntyre M, Doeltgen S, Shao C, Chimunda T. The incidence and clinical outcomes of postextubation dysphagia in a regional critical care setting. Aust Crit Care. 2022;35(2):107–12.

See KC, Peng SY, Phua J, Sum CL, Concepcion J. Nurse-performed screening for postextubation dysphagia: a retrospective cohort study in critically ill medical patients. Crit Care. 2016;20(1):326.

Schefold JC, Berger D, Zurcher P, Lensch M, Perren A, Jakob SM, Parviainen I, Takala J. Dysphagia in mechanically ventilated ICU patients (DYnAMICS): a prospective observational trial. Crit Care Med. 2017;45(12):2061–9.

Byun SE, Shon HC, Kim JW, Kim HK, Sim Y. Risk factors and prognostic implications of aspiration pneumonia in older hip fracture patients: a multicenter retrospective analysis. Geriatr Gerontol Int. 2019;19(2):119–23.

Jaillette E, Martin-Loeches I, Artigas A, Nseir S. Optimal care and design of the tracheal cuff in the critically ill patient. Ann Intensive Care. 2014;4(1):7.

Tang JY, Feng XQ, Huang XX, Zhang YP, Guo ZT, Chen L, Chen HT, Ying XX. Development and validation of a predictive model for patients with post-extubation dysphagia. World J Emerg Med. 2023;14(1):49–55.

Xia C, Ji J. The characteristics and predicators of post-extubation dysphagia in ICU patients with endotracheal intubation. Dysphagia. 2022;38:253.

Beduneau G, Souday V, Richard JC, Hamel JF, Carpentier D, Chretien JM, Bouchetemble P, Laccoureye L, Astier A, Tanguy V, et al. Persistent swallowing disorders after extubation in mechanically ventilated patients in ICU: a two-center prospective study. Ann Intensive Care. 2020;10(1):1–7.

Johnson KL, Speirs L, Mitchell A, Przybyl H, Anderson D, Manos B, Schaenzer AT, Winchester K. Validation of a postextubation dysphagia screening tool for patients after prolonged endotracheal intubation. Am J Crit Care. 2018;27(2):89–96.

Yılmaz D, Mengi T, Sarı S. Post-extubation dysphagia and COVID-2019. Turkish J Neurol. 2021;27:21–5.

Oliveira A, Friche A, Salomão MS, Bougo GC, Vicente L. Predictive factors for oropharyngeal dysphagia after prolonged orotracheal intubation. Brazil J Otorhinolaryngol. 2018;84(6):722–8.

Yamada T, Ochiai R, Kotake Y. Changes in maximum tongue pressure and postoperative dysphagia in mechanically ventilated patients after cardiovascular surgery. Indian J Crit Care Med. 2022;26(12):1253–8.

Brodsky MB, Huang M, Shanholtz C, Mendez-Tellez PA, Palmer JB, Colantuoni E, Needham DM. Recovery from dysphagia symptoms after oral endotracheal intubation in acute respiratory distress syndrome survivors. A 5-year longitudinal study. Ann Am Thorac Soc. 2017;14(3):376–83.

Skoretz SA, Yau TM, Ivanov J, Granton JT, Martino R. Dysphagia and associated risk factors following extubation in cardiovascular surgical patients. Dysphagia. 2014;29(6):647–54.

Park HS, Koo JH, Song SH. Association of post-extubation dysphagia with tongue weakness and somatosensory disturbance in non-neurologic critically ill patients. Ann Rehabil Med Arm. 2017;41(6):961–8.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Rev Esp Cardiol (Engl Ed). 2021;74(9):790–9.

Higgins JP, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, Savovic J, Schulz KF, Weeks L, Sterne JA. The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ Br Med J. 2011;343: d5928.

Lo CK, Mertz D, Loeb M. Newcastle-Ottawa Scale: comparing reviewers’ to authors’ assessments. BMC Med Res Methodol. 2014;14:45.

Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, Schünemann HJ. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ-Br Med J. 2008;336(7650):924–6.

El SA, Okada M, Bhat A, Pietrantoni C. Swallowing disorders post orotracheal intubation in the elderly. Intensive Care Med. 2003;29(9):1451–5.

Yang WJ, Park E, Min YS, Huh JW, Kim AR, Oh HM, Nam TW, Jung TD. Association between clinical risk factors and severity of dysphagia after extubation based on a videofluoroscopic swallowing study. Korean J Intern Med. 2020;35(1):79.

Megarbane B, Hong TB, Kania R, Herman P, Baud FJ. Early laryngeal injury and complications because of endotracheal intubation in acutely poisoned patients: a prospective observational study. Clin Toxicol. 2010;48(4):331–6.

Scheel R, Pisegna JM, McNally E, Noordzij JP, Langmore SE. Endoscopic assessment of swallowing after prolonged intubation in the ICU setting. Ann Otol Rhinol Laryngol. 2016;125(1):43–52.

Fan GUO, Mingming WANG, Shengqiang ZOU. Analysis of risk factors and establishment of prediction model for post-extubation swallowing dysfunction in ICU patients with endotracheal intubation. Chin Nurs Res. 2020;34(19):3424–8.

Yaqian W. Localization and evaluation of reliability and validity of the GUSS-ICU bedside swallowing screening tool. Master's thesis, Huzhou University; 2020.

Yun D, Yuan Z, Yanli Y. Risk factors and nursing strategies of the occurrences of acquired swallowing disorders after ICU patients treated with oral tracheal intubation and extubation. Med Equip. 2021;34(1):20–2.

JinTian Y. Study on the recovery of swallowing function and the real experience of patients with acquired swallowing disorder after cardiac surgery. Master's thesis, Nanjing University; 2020.

de Medeiros GC, Sassi FC, Mangilli LD, Zilberstein B, de Andrade C. Clinical dysphagia risk predictors after prolonged orotracheal intubation. Clinics. 2014;69(1):8–14.

Kwok AM, Davis JW, Cagle KM, Sue LP, Kaups KL. Post-extubation dysphagia in trauma patients: it’s hard to swallow. Am J Surg. 2013;206(6):924–7 ( 927–928 ).

Barker J, Martino R, Reichardt B, Hickey EJ, Ralph-Edwards A. Incidence and impact of dysphagia in patients receiving prolonged endotracheal intubation after cardiac surgery. Can J Surg. 2009;52(2):119–24.

Bordon A, Bokhari R, Sperry J, Testa D, Feinstein A, Ghaemmaghami V. Swallowing dysfunction after prolonged intubation: analysis of risk factors in trauma patients. Am J Surg. 2011;202(6):679–82.

Limin Z. The application of the Gugging Swallowing Screen in post-extubation swallowing dysfunction assessment after long-term intubation. Master's thesis, Tianjin Medical University; 2016.

Omura K, Komine A, Yanagigawa M, Chiba N, Osada M. Frequency and outcome of post-extubation dysphagia using nurse-performed swallowing screening protocol. Nurs Crit Care. 2019;24(2):70–5.

Regala M, Marvin S, Ehlenbach WJ. Association between postextubation dysphagia and long-term mortality among critically ill older adults. J Am Geriatr Soc. 2019;67(9):1895–901.

Meng PP, Zhang SC, Han C, Wang Q, Bai GT, Yue SW. The occurrence rate of swallowing disorders after stroke patients in Asia: a PRISMA-compliant systematic review and meta-analysis. J Stroke Cerebrovasc Dis Off J Nat Stroke Assoc. 2020;29(10): 105113.

Yingli H, Mengxin C, Donglei S. Incidence and influencing factors of post-extubation dysphagia among patients with mechanical ventilation: a meta-analysis. Chin J Modern Nurs. 2019;25(17):2158–63.

Spronk PE, Spronk LEJ, Egerod I, McGaughey J, McRae J, Rose L, Brodsky MB, Brodsky MB, Rose L, Lut J, et al. Dysphagia in intensive care evaluation (DICE): an international cross-sectional survey. Dysphagia. 2022;37(6):1451–60.

Pourhoseingholi MA, Vahedi M, Rahimzadeh M. Sample size calculation in medical studies. Gastroenterol Hepatol Bed Bench. 2013;6(1):14–7.

Faber J, Fonseca LM. How sample size influences research outcomes. Dental Press J Orthod. 2014;19(4):27–9.

Zuercher P, Moret CS, Dziewas R, Schefold JC. Dysphagia in the intensive care unit: epidemiology, mechanisms, and clinical management. Crit Care. 2019;23(1):103.

Malandraki GA, Markaki V, Georgopoulos VC, Psychogios L, Nanas S. Postextubation dysphagia in critical patients: a first report from the largest step-down intensive care unit in Greece. Am J Speech Lang Pathol. 2016;25(2):150–6.

Ambika RS, Datta B, Manjula BV, Warawantkar UV, Thomas AM. Fiberoptic endoscopic evaluation of swallow (FEES) in intensive care unit patients post extubation. Indian J Otolaryngol Head Neck Surg. 2019;71(2):266–70.

Funding

No funding.

Author information

Weixia Yu and Limi Dan contributed as the co-first authors.

Authors and Affiliations

Department of Nursing, the First Affiliated Hospital of Soochow University, Suzhou, 215006, China

Weixia Yu, Limi Dan, Jianzheng Cai, Yuyu Wang, Qingling Wang, Yingying Zhang & Xin Wang


Contributions

Weixia Yu, Limi Dan, Jianzheng Cai, and Yuyu Wang developed the original concept of this systematic review and meta-analysis. Weixia Yu, Limi Dan, Jianzheng Cai, and Yuyu Wang contributed to the screening of eligible studies, data extraction, and data synthesis. Weixia Yu, Limi Dan, Jianzheng Cai, Yuyu Wang, and Qingling Wang drafted the first version of the manuscript. Yingying Zhang, Qingling Wang, and Xin Wang prepared the tables and figures. All authors edited the manuscript and contributed to its intellectual content. All authors read and approved the final manuscript and take public responsibility for it.

Corresponding authors

Correspondence to Jianzheng Cai or Yuyu Wang .

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information


Additional file 1: Table S1. PRISMA 2020 Checklist. Table S2. Search strategy. Table S3. Quality evaluation results of the cohort studies. Table S4. Quality evaluation results of the cross-sectional study. Fig. S1. Subgroup analysis of the incidence of PED by assessment time. Fig. S2. Subgroup analysis of the incidence of PED by sample size. Fig. S3. Incidence of PED by assessment tool. Fig. S4. Incidence of PED by mean intubation time. Fig. S5 Incidence of PED by mean age. Fig. S6. Incidence of PED by ICU type. Fig. S7. Incidence of PED by evaluator. Fig. S8. Incidence of PED by year of publication. Fig. S9. Incidence of PED by study design. Fig. S10. Incidence of PED by quality of cohort study. Fig. S11. Incidence of PED by quality of Cross-sectional study. Fig. S12. Bubble plot of meta-regression result for evaluate time as a covariate. Fig. S13. Bubble plot of meta-regression result for sample size as a covariate. Fig. S14. Bubble plot of meta-regression result for assessment tool as a covariate. Fig. S15. Bubble plot of meta-regression result for mean intubation time as a covariate. Fig. S16. Bubble plot of meta-regression result for mean age as a covariate. Fig. S17. Bubble plot of meta-regression result for ICU type as a covariate. Fig. S18. Bubble plot of meta-regression result for evaluator as a covariate. Fig. S19. Bubble plot of meta-regression result for year of publication as a covariate. Fig. S20. Bubble plot of meta-regression result for study design as a covariate. Fig. S21. Bubble plot of meta-regression result for quality of cohort study as a covariate. Fig. S22. Bubble plot of meta-regression result for quality of cross-sectional study as a covariate. Fig. S23. Sensitivity analysis of PED. Fig. S24. Publication bias assessment plot. Fig. S25. Publication bias assessment plot. “Trim and Full test” method.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article

Yu, W., Dan, L., Cai, J. et al. Incidence of post-extubation dysphagia among critical care patients undergoing orotracheal intubation: a systematic review and meta-analysis. Eur J Med Res 29 , 444 (2024). https://doi.org/10.1186/s40001-024-02024-x

Received : 19 December 2023

Accepted : 12 August 2024

Published : 31 August 2024

DOI : https://doi.org/10.1186/s40001-024-02024-x


Keywords

  • Orotracheal intubation
  • Post-extubation
  • Systematic review
  • Meta-analysis

European Journal of Medical Research

ISSN: 2047-783X



Gross Domestic Product (Second Estimate), Corporate Profits (Preliminary Estimate), Second Quarter 2024


Real gross domestic product (GDP) increased at an annual rate of 3.0 percent in the second quarter of 2024 (table 1), according to the "second" estimate released by the U.S. Bureau of Economic Analysis. In the first quarter, real GDP increased 1.4 percent.

The GDP estimate released today is based on more complete source data than were available for the "advance" estimate issued last month.  In the advance estimate, the increase in real GDP was 2.8 percent. The update primarily reflected an upward revision to consumer spending (refer to "Updates to GDP").

Real GDP: Percent change from preceding quarter

The increase in real GDP primarily reflected increases in consumer spending, private inventory investment, and nonresidential fixed investment. Imports, which are a subtraction in the calculation of GDP, increased (table 2).

Compared to the first quarter, the acceleration in real GDP in the second quarter primarily reflected an upturn in private inventory investment and an acceleration in consumer spending. These movements were partly offset by a downturn in residential fixed investment.

Current‑dollar GDP increased 5.5 percent at an annual rate, or $383.2 billion, in the second quarter to a level of $28.65 trillion, an upward revision of $23.2 billion from the previous estimate (tables 1 and 3). More information on the source data that underlie the estimates is available in the " Key Source Data and Assumptions " file on BEA's website.

The price index for gross domestic purchases increased 2.4 percent in the second quarter, an upward revision of 0.1 percentage point from the previous estimate. The personal consumption expenditures (PCE) price index increased 2.5 percent, a downward revision of 0.1 percentage point. Excluding food and energy prices, the PCE price index increased 2.8 percent, a downward revision of 0.1 percentage point.

Personal Income

Current-dollar personal income increased $233.6 billion in the second quarter, a downward revision of $4.0 billion from the previous estimate. The increase primarily reflected increases in compensation and personal current transfer receipts (table 8).

Disposable personal income increased $183.0 billion, or 3.6 percent, in the second quarter, a downward revision of $3.2 billion from the previous estimate. Real disposable personal income increased 1.0 percent, unrevised from the prior estimate.

Personal saving was $686.4 billion in the second quarter, a downward revision of $34.1 billion from the previous estimate. The personal saving rate —personal saving as a percentage of disposable personal income—was 3.3 percent in the second quarter, a downward revision of 0.2 percentage point.

Gross Domestic Income and Corporate Profits

Real gross domestic income (GDI) increased 1.3 percent in the second quarter, the same as in the first quarter. The average of real GDP and real GDI , a supplemental measure of U.S. economic activity that equally weights GDP and GDI, increased 2.1 percent in the second quarter, compared with an increase of 1.4 percent in the first quarter (table 1).

Profits from current production (corporate profits with inventory valuation and capital consumption adjustments) increased $57.6 billion in the second quarter, in contrast to a decrease of $47.1 billion in the first quarter (table 10).

Profits of domestic financial corporations increased $46.4 billion in the second quarter, compared with an increase of $65.0 billion in the first quarter. Profits of domestic nonfinancial corporations increased $29.2 billion, in contrast to a decrease of $114.5 billion. Rest-of-the-world profits decreased $18.0 billion, in contrast to an increase of $2.3 billion. In the second quarter, receipts decreased $6.2 billion, and payments increased $11.8 billion.

Updates to GDP

With the second estimate, an upward revision to consumer spending was partly offset by downward revisions to nonresidential fixed investment, exports, private inventory investment, federal government spending, state and local government spending, and residential fixed investment. Imports were revised up. For more information, refer to the Technical Note . For information on updates to GDP, refer to the "Additional Information" section that follows.

 
(Percent change from preceding quarter, seasonally adjusted annual rates, second quarter 2024)

Measure | Advance estimate | Second estimate
Real GDP | 2.8 | 3.0
Current-dollar GDP | 5.2 | 5.5
Real GDI | n/a | 1.3
Average of Real GDP and Real GDI | n/a | 2.1
Gross domestic purchases price index | 2.3 | 2.4
PCE price index | 2.6 | 2.5
PCE price index excluding food and energy | 2.9 | 2.8

First Quarter Wages and Salaries

BEA's standard practice for first-quarter estimates of wages and salaries is to incorporate data from the Bureau of Labor Statistics' Quarterly Census of Employment and Wages (QCEW) program as part of the annual update of the National Economic Accounts. New QCEW data for the first quarter of 2024 will be incorporated in next month's release along with the 2024 Annual Update of the National Economic Accounts (refer to box below for details).

BEA will release results from the 2024 annual update of the National Economic Accounts, which include the National Income and Product Accounts as well as the Industry Economic Accounts, on September 26, 2024. The update will present revised statistics for GDP, GDP by Industry, and gross domestic income. For details, refer to Information on 2024 Annual Updates to the National, Industry, and State and Local Economic Accounts .

*          *          *

Next release: September 26, 2024, at 8:30 a.m. EDT
Gross Domestic Product (Third Estimate), Corporate Profits (Revised Estimate), and Gross Domestic Product by Industry, Second Quarter 2024 and Annual Update

Full Release & Tables (PDF)

  • Technical Note (PDF)
  • Tables Only (Excel)
  • Release Highlights (PDF)
  • Historical Comparisons (PDF)
  • Key Source Data and Assumptions (Excel)
  • Revision Information

Additional resources available at www.bea.gov :

  • Stay informed about BEA developments by reading the BEA blog , signing up for BEA's email subscription service , or following BEA on X, formerly known as Twitter @BEA_News .
  • Historical time series for these estimates can be accessed in BEA's interactive data application .
  • Access BEA data by registering for BEA's data Application Programming Interface (API).
  • For more on BEA's statistics, refer to our online journal, the Survey of Current Business .
  • BEA's news release schedule
  • NIPA Handbook : Concepts and Methods of the U.S. National Income and Product Accounts

Definitions

Gross domestic product (GDP), or value added , is the value of the goods and services produced by the nation's economy less the value of the goods and services used up in production. GDP is also equal to the sum of personal consumption expenditures, gross private domestic investment, net exports of goods and services, and government consumption expenditures and gross investment.

Gross domestic income (GDI) is the sum of incomes earned and costs incurred in the production of GDP. In national economic accounting, GDP and GDI are conceptually equal. In practice, GDP and GDI differ because they are constructed using largely independent source data.

Gross output is the value of the goods and services produced by the nation's economy. It is principally measured using industry sales or receipts, including sales to final users (GDP) and sales to other industries (intermediate inputs).

Current-dollar estimates are valued in the prices of the period when the transactions occurred—that is, at "market value." Also referred to as "nominal estimates" or as "current-price estimates."

Real values are inflation-adjusted estimates—that is, estimates that exclude the effects of price changes.

The gross domestic purchases price index measures the prices of final goods and services purchased by U.S. residents.

The personal consumption expenditure price index measures the prices paid for the goods and services purchased by, or on the behalf of, "persons."

Personal income is the income received by, or on behalf of, all persons from all sources: from participation as laborers in production, from owning a home or business, from the ownership of financial assets, and from government and business in the form of transfers. It includes income from domestic sources as well as the rest of world. It does not include realized or unrealized capital gains or losses.

Disposable personal income is the income available to persons for spending or saving. It is equal to personal income less personal current taxes.

Personal outlays is the sum of personal consumption expenditures, personal interest payments, and personal current transfer payments.

Personal saving is personal income less personal outlays and personal current taxes.

The personal saving rate is personal saving as a percentage of disposable personal income.

Profits from current production , referred to as corporate profits with inventory valuation adjustment (IVA) and capital consumption adjustment (CCAdj) in the National Income and Product Accounts (NIPAs), is a measure of the net income of corporations before deducting income taxes that is consistent with the value of goods and services measured in GDP. The IVA and CCAdj are adjustments that convert inventory withdrawals and depreciation of fixed assets reported on a tax-return, historical-cost basis to the current-cost economic measures used in the national income and product accounts. Profits for domestic industries reflect profits for all corporations located within the geographic borders of the United States. The rest-of-the-world (ROW) component of profits is measured as the difference between profits received from ROW and profits paid to ROW.

For more definitions, refer to the Glossary: National Income and Product Accounts .

Statistical conventions

Annual-vs-quarterly rates . Quarterly seasonally adjusted values are expressed at annual rates, unless otherwise specified. This convention is used for BEA's featured, seasonally adjusted measures to facilitate comparisons with related and historical data. For details, refer to the FAQ " Why does BEA publish estimates at annual rates? "

Quarterly not seasonally adjusted values are expressed only at quarterly rates.

Percent changes . Percent changes in quarterly seasonally adjusted series are displayed at annual rates, unless otherwise specified. For details, refer to the FAQ " How is average annual growth calculated? " and " Why does BEA publish percent changes in quarterly series at annual rates? " Percent changes in quarterly not seasonally adjusted values are calculated from the same quarter one year ago. All published percent changes are calculated from unrounded data.
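As a worked illustration of this convention (the compounding formula below is the standard way a quarterly change is expressed at an annual rate, not text quoted from the release), a seasonally adjusted quarterly level $X_t$ is annualized as

$$\text{annualized percent change} \;=\; \left[\left(\frac{X_t}{X_{t-1}}\right)^{4} - 1\right] \times 100,$$

so, for example, a 0.7 percent rise from one quarter to the next corresponds to roughly a 2.8 percent annual rate.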

Calendar years and quarters . Unless noted otherwise, annual and quarterly data are presented on a calendar basis.

Quantities and prices . Quantities, or "real" volume measures, and prices are expressed as index numbers with a specified reference year equal to 100 (currently 2017). Quantity and price indexes are calculated using a Fisher-chained weighted formula that incorporates weights from two adjacent periods (quarters for quarterly data and annuals for annual data). For details on the calculation of quantity and price indexes, refer to Chapter 4: Estimating Methods in the NIPA Handbook .
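For reference, the Fisher quantity index underlying these measures is the geometric mean of the Laspeyres and Paasche quantity indexes for two adjacent periods; the textbook form is shown here as an illustration rather than quoted from BEA documentation:

$$Q^{F}_{t} \;=\; \sqrt{\frac{\sum p_{t-1}\,q_{t}}{\sum p_{t-1}\,q_{t-1}} \times \frac{\sum p_{t}\,q_{t}}{\sum p_{t}\,q_{t-1}}},$$

where $p$ and $q$ are the prices and quantities of the detailed components summed over each period.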

Chained-dollar values are calculated by multiplying the quantity index by the current dollar value in the reference year (2017) and then dividing by 100. Percent changes calculated from real quantity indexes and chained-dollar levels are conceptually the same; any differences are due to rounding. Chained-dollar values are not additive because the relative weights for a given period differ from those of the reference year. In tables that display chained-dollar values, a "residual" line shows the difference between the sum of detailed chained-dollar series and its corresponding aggregate.
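A tiny numerical example of the chained-dollar arithmetic described above, using made-up numbers (only the 2017 reference year comes from the release):

```python
# Hypothetical chained-dollar calculation: quantity index times the
# reference-year (2017) current-dollar value, divided by 100.
ref_year_current_dollars = 19_000.0   # hypothetical 2017 current-dollar value, $ billions
quantity_index = 112.5                # hypothetical quantity index (2017 = 100)

chained_dollars = quantity_index * ref_year_current_dollars / 100
print(f"Chained (2017) dollars: {chained_dollars:,.1f} billion")   # -> 21,375.0 billion
```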

BEA releases three vintages of the current quarterly estimate for GDP. "Advance" estimates are released near the end of the first month following the end of the quarter and are based on source data that are incomplete or subject to further revision by the source agency. "Second" and "third" estimates are released near the end of the second and third months, respectively, and are based on more detailed and more comprehensive data as they become available.

The table below shows the average revisions to the quarterly percent changes in real GDP between different estimate vintages, without regard to sign.

Vintage | Average revision without regard to sign (percentage points, annual rates)
Advance to second | 0.5
Advance to third | 0.6
Second to third | 0.3
Advance to latest | 1.2

Additional revision information is available on the BEA Website.

Annual and comprehensive updates are released in late September. Annual updates generally cover at least the five most recent calendar years (and their associated quarters) and incorporate newly available major annual source data as well as some changes in methods and definitions to improve the accounts. Comprehensive (or benchmark) updates are carried out at about 5-year intervals and incorporate major periodic source data, as well as major conceptual improvements.

Unlike GDP, advance current quarterly estimates of GDI and corporate profits are not released because data on domestic profits and net interest of domestic industries are not available. For fourth quarter estimates, these data are not available until the third estimate.

GDP by industry and gross output estimates are released with the third estimate of GDP.




Chemical Science

Musketeer: a software tool for the analysis of titration data†


* Corresponding authors

a Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK E-mail: [email protected]

Musketeer is a powerful open-source software tool for the analysis of titration data, featuring a simple cross-platform graphical interface for importing data directly from UV-vis, fluorescence and NMR spectrometers, or from spreadsheets. The fast data analysis algorithm can be used to obtain equilibrium constants for simple binding isotherms, as well as for more complicated systems with multiple competing equilibria. Applications of Musketeer for the analysis of a range of different supramolecular and biomolecular systems are illustrated, including titrations with multiple spectroscopically active species, competitive binding assays, denaturation experiments, and optimisation of concentrations as variables. The software also includes a number of tools that can be used to select the binding isotherm that represents the best model to describe a dataset.

Graphical abstract: Musketeer: a software tool for the analysis of titration data

Supplementary files

  • Supplementary information ZIP (332K)
  • Supplementary information PDF (491K)

Article information


Musketeer: a software tool for the analysis of titration data

D. O. Soloviev and C. A. Hunter, Chem. Sci. , 2024, Advance Article , DOI: 10.1039/D4SC03354J

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence . You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.

Read more about how to correctly acknowledge RSC content .


NASA’s Mini BurstCube Mission Detects Mega Blast

The shoebox-sized BurstCube satellite has observed its first gamma-ray burst, the most powerful kind of explosion in the universe, according to a recent analysis of observations collected over the last several months.

“We’re excited to collect science data,” said Sean Semper, BurstCube’s lead engineer at NASA’s  Goddard Space Flight Center  in Greenbelt, Maryland. “It’s an important milestone for the team and for the many early career engineers and scientists that have been part of the mission.”

The event, called GRB 240629A, occurred on June 29 in the southern constellation Microscopium. The team announced the discovery in a GCN (General Coordinates Network) circular on August 29.

The BurstCube and SNOOPI satellites deploy into space in this photograph.

BurstCube  deployed into orbit April 18 from the International Space Station, following a March 21  launch .

The mission was designed to detect, locate, and study short  gamma-ray bursts , brief flashes of high-energy light created when superdense objects like  neutron stars  collide. These collisions also produce  heavy elements  like gold and iodine, an essential ingredient for life as we know it. 

BurstCube is the first CubeSat to use NASA’s  TDRS (Tracking and Data Relay Satellite)  system, a constellation of specialized communications spacecraft. Data relayed by TDRS (pronounced “tee-driss”) help coordinate rapid follow-up measurements by other observatories in space and on the ground through NASA’s GCN .

BurstCube also regularly beams data back to Earth using the Direct to Earth system — both it and TDRS are part of NASA’s  Near Space Network .

After BurstCube deployed from the space station, the team discovered that one of the two solar panels failed to fully extend. It obscures the view of the mission’s star tracker, which hinders orienting the spacecraft in a way that minimizes drag. The team originally hoped to operate BurstCube for 12-18 months, but now estimates the increased drag will cause the satellite to re-enter the atmosphere in September. 

“I’m proud of how the team responded to the situation and is making the best use of the time we have in orbit,” said Jeremy Perkins, BurstCube’s principal investigator at Goddard. “Small missions like BurstCube not only provide an opportunity to do great science and test new technologies, like our mission’s gamma-ray detector, but also important learning opportunities for the up-and-coming members of the astrophysics community.”

BurstCube is led by Goddard. It’s funded by the  Science Mission Directorate’s Astrophysics Division  at NASA Headquarters. The BurstCube collaboration includes: the University of Alabama in Huntsville; the University of Maryland, College Park; the Universities Space Research Association in Washington; the Naval Research Laboratory in Washington; and NASA’s Marshall Space Flight Center in Huntsville.

By Jeanette Kazmierczak, NASA's Goddard Space Flight Center, Greenbelt, Md.

Media Contact: Claire Andreoli, 301-286-1940, [email protected], NASA's Goddard Space Flight Center, Greenbelt, Md.

Related Terms

  • Astrophysics
  • Gamma-Ray Bursts
  • Goddard Space Flight Center
  • ISS Research
  • Small Satellite Missions
  • The Universe

PW Skills | Blog

Data Analysis Techniques in Research – Methods, Tools & Examples


Varun Saharawat is a seasoned professional in the fields of SEO and content writing. With a profound knowledge of the intricate aspects of these disciplines, Varun has established himself as a valuable asset in the world of digital marketing and online content creation.

Data analysis techniques in research are essential because they allow researchers to derive meaningful insights from data sets to support their hypotheses or research objectives.


Data Analysis Techniques in Research : While various groups, institutions, and professionals may have diverse approaches to data analysis, a universal definition captures its essence. Data analysis involves refining, transforming, and interpreting raw data to derive actionable insights that guide informed decision-making for businesses.

A straightforward illustration of data analysis emerges when we make everyday decisions, basing our choices on past experiences or predictions of potential outcomes.

If you want to learn more about this topic and acquire valuable skills that will set you apart in today’s data-driven world, we highly recommend enrolling in the Data Analytics Course by Physics Wallah . And as a special offer for our readers, use the coupon code “READER” to get a discount on this course.


What is Data Analysis?

Data analysis is the systematic process of inspecting, cleaning, transforming, and interpreting data with the objective of discovering valuable insights and drawing meaningful conclusions. This process involves several steps:

  • Inspecting : Initial examination of data to understand its structure, quality, and completeness.
  • Cleaning : Removing errors, inconsistencies, or irrelevant information to ensure accurate analysis.
  • Transforming : Converting data into a format suitable for analysis, such as normalization or aggregation.
  • Interpreting : Analyzing the transformed data to identify patterns, trends, and relationships.

Types of Data Analysis Techniques in Research

Data analysis techniques in research are categorized into qualitative and quantitative methods, each with its specific approaches and tools. These techniques are instrumental in extracting meaningful insights, patterns, and relationships from data to support informed decision-making, validate hypotheses, and derive actionable recommendations. Below is an in-depth exploration of the various types of data analysis techniques commonly employed in research:

1) Qualitative Analysis:

Definition: Qualitative analysis focuses on understanding non-numerical data, such as opinions, concepts, or experiences, to derive insights into human behavior, attitudes, and perceptions.

  • Content Analysis: Examines textual data, such as interview transcripts, articles, or open-ended survey responses, to identify themes, patterns, or trends.
  • Narrative Analysis: Analyzes personal stories or narratives to understand individuals’ experiences, emotions, or perspectives.
  • Ethnographic Studies: Involves observing and analyzing cultural practices, behaviors, and norms within specific communities or settings.

2) Quantitative Analysis:

Quantitative analysis emphasizes numerical data and employs statistical methods to explore relationships, patterns, and trends. It encompasses several approaches:

Descriptive Analysis:

  • Frequency Distribution: Represents the number of occurrences of distinct values within a dataset.
  • Central Tendency: Measures such as mean, median, and mode provide insights into the central values of a dataset.
  • Dispersion: Techniques like variance and standard deviation indicate the spread or variability of data.

Diagnostic Analysis:

  • Regression Analysis: Assesses the relationship between dependent and independent variables, enabling prediction or understanding causality.
  • ANOVA (Analysis of Variance): Examines differences between groups to identify significant variations or effects.

Predictive Analysis:

  • Time Series Forecasting: Uses historical data points to predict future trends or outcomes.
  • Machine Learning Algorithms: Techniques like decision trees, random forests, and neural networks predict outcomes based on patterns in data.

Prescriptive Analysis:

  • Optimization Models: Utilizes linear programming, integer programming, or other optimization techniques to identify the best solutions or strategies.
  • Simulation: Mimics real-world scenarios to evaluate various strategies or decisions and determine optimal outcomes.

Specific Techniques:

  • Monte Carlo Simulation: Models probabilistic outcomes to assess risk and uncertainty.
  • Factor Analysis: Reduces the dimensionality of data by identifying underlying factors or components.
  • Cohort Analysis: Studies specific groups or cohorts over time to understand trends, behaviors, or patterns within these groups.
  • Cluster Analysis: Classifies objects or individuals into homogeneous groups or clusters based on similarities or attributes.
  • Sentiment Analysis: Uses natural language processing and machine learning techniques to determine sentiment, emotions, or opinions from textual data.

Also Read: AI and Predictive Analytics: Examples, Tools, Uses, Ai Vs Predictive Analytics

Data Analysis Techniques in Research Examples

To provide a clearer understanding of how data analysis techniques are applied in research, let’s consider a hypothetical research study focused on evaluating the impact of online learning platforms on students’ academic performance.

Research Objective:

Determine if students using online learning platforms achieve higher academic performance compared to those relying solely on traditional classroom instruction.

Data Collection:

  • Quantitative Data: Academic scores (grades) of students using online platforms and those using traditional classroom methods.
  • Qualitative Data: Feedback from students regarding their learning experiences, challenges faced, and preferences.

Data Analysis Techniques Applied:

1) Descriptive Analysis:

  • Calculate the mean, median, and mode of academic scores for both groups.
  • Create frequency distributions to represent the distribution of grades in each group.

2) Diagnostic Analysis:

  • Conduct an Analysis of Variance (ANOVA) to determine if there’s a statistically significant difference in academic scores between the two groups.
  • Perform Regression Analysis to assess the relationship between the time spent on online platforms and academic performance.

3) Predictive Analysis:

  • Utilize Time Series Forecasting to predict future academic performance trends based on historical data.
  • Implement Machine Learning algorithms to develop a predictive model that identifies factors contributing to academic success on online platforms.

4) Prescriptive Analysis:

  • Apply Optimization Models to identify the optimal combination of online learning resources (e.g., video lectures, interactive quizzes) that maximize academic performance.
  • Use Simulation Techniques to evaluate different scenarios, such as varying student engagement levels with online resources, to determine the most effective strategies for improving learning outcomes.

5) Specific Techniques:

  • Conduct Factor Analysis on qualitative feedback to identify common themes or factors influencing students’ perceptions and experiences with online learning.
  • Perform Cluster Analysis to segment students based on their engagement levels, preferences, or academic outcomes, enabling targeted interventions or personalized learning strategies.
  • Apply Sentiment Analysis on textual feedback to categorize students’ sentiments as positive, negative, or neutral regarding online learning experiences.

By applying a combination of qualitative and quantitative data analysis techniques, this research example aims to provide comprehensive insights into the effectiveness of online learning platforms.

Also Read: Learning Path to Become a Data Analyst in 2024

Data Analysis Techniques in Quantitative Research

Quantitative research involves collecting numerical data to examine relationships, test hypotheses, and make predictions. Various data analysis techniques are employed to interpret and draw conclusions from quantitative data. Here are some key data analysis techniques commonly used in quantitative research:

1) Descriptive Statistics:

  • Description: Descriptive statistics are used to summarize and describe the main aspects of a dataset, such as central tendency (mean, median, mode), variability (range, variance, standard deviation), and distribution (skewness, kurtosis).
  • Applications: Summarizing data, identifying patterns, and providing initial insights into the dataset.
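To make the measures above concrete, here is a minimal Python sketch using pandas; the test scores are hypothetical and exist purely for illustration.

```python
import pandas as pd

# Hypothetical test scores, used purely for illustration.
scores = pd.Series([72, 85, 90, 66, 85, 78, 92, 70, 85, 61])

print("Mean:", scores.mean())
print("Median:", scores.median())
print("Mode:", scores.mode().tolist())
print("Range:", scores.max() - scores.min())
print("Variance:", scores.var())        # sample variance
print("Std deviation:", scores.std())   # sample standard deviation
print("Skewness:", scores.skew())
print("Kurtosis:", scores.kurt())
```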

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. This technique includes hypothesis testing, confidence intervals, t-tests, chi-square tests, analysis of variance (ANOVA), regression analysis, and correlation analysis.
  • Applications: Testing hypotheses, making predictions, and generalizing findings from a sample to a larger population.

3) Regression Analysis:

  • Description: Regression analysis is a statistical technique used to model and examine the relationship between a dependent variable and one or more independent variables. Linear regression, multiple regression, logistic regression, and nonlinear regression are common types of regression analysis .
  • Applications: Predicting outcomes, identifying relationships between variables, and understanding the impact of independent variables on the dependent variable.
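As an illustration of simple linear regression, the sketch below fits an ordinary least squares model with statsmodels; the study-hours and exam-score data are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: hours studied (independent) and exam score (dependent).
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
score = np.array([52, 55, 61, 64, 70, 73, 79, 84], dtype=float)

X = sm.add_constant(hours)      # add an intercept term
model = sm.OLS(score, X).fit()  # ordinary least squares fit

print(model.params)             # intercept and slope
print(model.rsquared)           # proportion of variance explained
```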

4) Correlation Analysis:

  • Description: Correlation analysis is used to measure and assess the strength and direction of the relationship between two or more variables. The Pearson correlation coefficient, Spearman rank correlation coefficient, and Kendall’s tau are commonly used measures of correlation.
  • Applications: Identifying associations between variables and assessing the degree and nature of the relationship.
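The sketch below computes the three correlation measures just mentioned with SciPy; the paired observations are hypothetical.

```python
from scipy import stats

# Hypothetical paired observations for illustration.
x = [2, 4, 5, 7, 9, 11, 12, 15]
y = [10, 14, 19, 21, 24, 30, 31, 38]

pearson_r, pearson_p = stats.pearsonr(x, y)
spearman_rho, spearman_p = stats.spearmanr(x, y)
kendall_tau, kendall_p = stats.kendalltau(x, y)

print(f"Pearson r = {pearson_r:.3f} (p = {pearson_p:.4f})")
print(f"Spearman rho = {spearman_rho:.3f} (p = {spearman_p:.4f})")
print(f"Kendall tau = {kendall_tau:.3f} (p = {kendall_p:.4f})")
```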

5) Factor Analysis:

  • Description: Factor analysis is a multivariate statistical technique used to identify and analyze underlying relationships or factors among a set of observed variables. It helps in reducing the dimensionality of data and identifying latent variables or constructs.
  • Applications: Identifying underlying factors or constructs, simplifying data structures, and understanding the underlying relationships among variables.

6) Time Series Analysis:

  • Description: Time series analysis involves analyzing data collected or recorded over a specific period at regular intervals to identify patterns, trends, and seasonality. Techniques such as moving averages, exponential smoothing, autoregressive integrated moving average (ARIMA), and Fourier analysis are used.
  • Applications: Forecasting future trends, analyzing seasonal patterns, and understanding time-dependent relationships in data.
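Here is a minimal pandas sketch of two of the smoothing techniques listed above (moving average and exponential smoothing); the monthly sales series is hypothetical.

```python
import pandas as pd

# Hypothetical monthly sales figures for illustration.
sales = pd.Series(
    [120, 132, 128, 150, 161, 158, 170, 182, 179, 195, 206, 201],
    index=pd.date_range("2023-01-01", periods=12, freq="MS"),
)

moving_avg = sales.rolling(window=3).mean()          # 3-month moving average
exp_smooth = sales.ewm(span=3, adjust=False).mean()  # exponential smoothing

print(moving_avg.tail())
print(exp_smooth.tail())
```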

7) ANOVA (Analysis of Variance):

  • Description: Analysis of variance (ANOVA) is a statistical technique used to analyze and compare the means of two or more groups or treatments to determine if they are statistically different from each other. One-way ANOVA, two-way ANOVA, and MANOVA (Multivariate Analysis of Variance) are common types of ANOVA.
  • Applications: Comparing group means, testing hypotheses, and determining the effects of categorical independent variables on a continuous dependent variable.
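A one-way ANOVA can be run in a few lines with SciPy, as sketched below; the three groups of scores are hypothetical.

```python
from scipy import stats

# Hypothetical scores for three groups (e.g., three teaching methods).
group_a = [78, 82, 88, 75, 80]
group_b = [85, 89, 91, 84, 87]
group_c = [70, 74, 69, 72, 77]

# One-way ANOVA: are the group means statistically different?
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```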

8) Chi-Square Tests:

  • Description: Chi-square tests are non-parametric statistical tests used to assess the association between categorical variables in a contingency table. The Chi-square test of independence, goodness-of-fit test, and test of homogeneity are common chi-square tests.
  • Applications: Testing relationships between categorical variables, assessing goodness-of-fit, and evaluating independence.
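The sketch below runs a chi-square test of independence on a small contingency table with SciPy; the counts are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical contingency table: rows = group, columns = preference.
observed = np.array([
    [30, 10, 20],
    [25, 15, 20],
])

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, dof = {dof}")
```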

These quantitative data analysis techniques provide researchers with valuable tools and methods to analyze, interpret, and derive meaningful insights from numerical data. The selection of a specific technique often depends on the research objectives, the nature of the data, and the underlying assumptions of the statistical methods being used.

Also Read: Analysis vs. Analytics: How Are They Different?

Data Analysis Methods

Data analysis methods refer to the techniques and procedures used to analyze, interpret, and draw conclusions from data. These methods are essential for transforming raw data into meaningful insights, facilitating decision-making processes, and driving strategies across various fields. Here are some common data analysis methods:

1) Descriptive Statistics:

  • Description: Descriptive statistics summarize and organize data to provide a clear and concise overview of the dataset. Measures such as mean, median, mode, range, variance, and standard deviation are commonly used.

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. Techniques such as hypothesis testing, confidence intervals, and regression analysis are used.

3) Exploratory Data Analysis (EDA):

  • Description: EDA techniques involve visually exploring and analyzing data to discover patterns, relationships, anomalies, and insights. Methods such as scatter plots, histograms, box plots, and correlation matrices are utilized.
  • Applications: Identifying trends, patterns, outliers, and relationships within the dataset.

4) Predictive Analytics:

  • Description: Predictive analytics use statistical algorithms and machine learning techniques to analyze historical data and make predictions about future events or outcomes. Techniques such as regression analysis, time series forecasting, and machine learning algorithms (e.g., decision trees, random forests, neural networks) are employed.
  • Applications: Forecasting future trends, predicting outcomes, and identifying potential risks or opportunities.

5) Prescriptive Analytics:

  • Description: Prescriptive analytics involve analyzing data to recommend actions or strategies that optimize specific objectives or outcomes. Optimization techniques, simulation models, and decision-making algorithms are utilized.
  • Applications: Recommending optimal strategies, decision-making support, and resource allocation.

6) Qualitative Data Analysis:

  • Description: Qualitative data analysis involves analyzing non-numerical data, such as text, images, videos, or audio, to identify themes, patterns, and insights. Methods such as content analysis, thematic analysis, and narrative analysis are used.
  • Applications: Understanding human behavior, attitudes, perceptions, and experiences.

7) Big Data Analytics:

  • Description: Big data analytics methods are designed to analyze large volumes of structured and unstructured data to extract valuable insights. Technologies such as Hadoop, Spark, and NoSQL databases are used to process and analyze big data.
  • Applications: Analyzing large datasets, identifying trends, patterns, and insights from big data sources.

8) Text Analytics:

  • Description: Text analytics methods involve analyzing textual data, such as customer reviews, social media posts, emails, and documents, to extract meaningful information and insights. Techniques such as sentiment analysis, text mining, and natural language processing (NLP) are used.
  • Applications: Analyzing customer feedback, monitoring brand reputation, and extracting insights from textual data sources.

These data analysis methods are instrumental in transforming data into actionable insights, informing decision-making processes, and driving organizational success across various sectors, including business, healthcare, finance, marketing, and research. The selection of a specific method often depends on the nature of the data, the research objectives, and the analytical requirements of the project or organization.

Also Read: Quantitative Data Analysis: Types, Analysis & Examples

Data Analysis Tools

Data analysis tools are essential instruments that facilitate the process of examining, cleaning, transforming, and modeling data to uncover useful information, make informed decisions, and drive strategies. Here are some prominent data analysis tools widely used across various industries:

1) Microsoft Excel:

  • Description: A spreadsheet software that offers basic to advanced data analysis features, including pivot tables, data visualization tools, and statistical functions.
  • Applications: Data cleaning, basic statistical analysis, visualization, and reporting.

2) R Programming Language :

  • Description: An open-source programming language specifically designed for statistical computing and data visualization.
  • Applications: Advanced statistical analysis, data manipulation, visualization, and machine learning.

3) Python (with Libraries like Pandas, NumPy, Matplotlib, and Seaborn):

  • Description: A versatile programming language with libraries that support data manipulation, analysis, and visualization.
  • Applications: Data cleaning, statistical analysis, machine learning, and data visualization.

4) SPSS (Statistical Package for the Social Sciences):

  • Description: A comprehensive statistical software suite used for data analysis, data mining, and predictive analytics.
  • Applications: Descriptive statistics, hypothesis testing, regression analysis, and advanced analytics.

5) SAS (Statistical Analysis System):

  • Description: A software suite used for advanced analytics, multivariate analysis, and predictive modeling.
  • Applications: Data management, statistical analysis, predictive modeling, and business intelligence.

6) Tableau:

  • Description: A data visualization tool that allows users to create interactive and shareable dashboards and reports.
  • Applications: Data visualization , business intelligence , and interactive dashboard creation.

7) Power BI:

  • Description: A business analytics tool developed by Microsoft that provides interactive visualizations and business intelligence capabilities.
  • Applications: Data visualization, business intelligence, reporting, and dashboard creation.

8) SQL (Structured Query Language) Databases (e.g., MySQL, PostgreSQL, Microsoft SQL Server):

  • Description: Database management systems that support data storage, retrieval, and manipulation using SQL queries.
  • Applications: Data retrieval, data cleaning, data transformation, and database management.

9) Apache Spark:

  • Description: A fast and general-purpose distributed computing system designed for big data processing and analytics.
  • Applications: Big data processing, machine learning, data streaming, and real-time analytics.

10) IBM SPSS Modeler:

  • Description: A data mining software application used for building predictive models and conducting advanced analytics.
  • Applications: Predictive modeling, data mining, statistical analysis, and decision optimization.

These tools serve various purposes and cater to different data analysis needs, from basic statistical analysis and data visualization to advanced analytics, machine learning, and big data processing. The choice of a specific tool often depends on the nature of the data, the complexity of the analysis, and the specific requirements of the project or organization.

Also Read: How to Analyze Survey Data: Methods & Examples

Importance of Data Analysis in Research

The importance of data analysis in research cannot be overstated; it serves as the backbone of any scientific investigation or study. Here are several key reasons why data analysis is crucial in the research process:

  • Data analysis helps ensure that the results obtained are valid and reliable. By systematically examining the data, researchers can identify any inconsistencies or anomalies that may affect the credibility of the findings.
  • Effective data analysis provides researchers with the necessary information to make informed decisions. By interpreting the collected data, researchers can draw conclusions, make predictions, or formulate recommendations based on evidence rather than intuition or guesswork.
  • Data analysis allows researchers to identify patterns, trends, and relationships within the data. This can lead to a deeper understanding of the research topic, enabling researchers to uncover insights that may not be immediately apparent.
  • In empirical research, data analysis plays a critical role in testing hypotheses. Researchers collect data to either support or refute their hypotheses, and data analysis provides the tools and techniques to evaluate these hypotheses rigorously.
  • Transparent and well-executed data analysis enhances the credibility of research findings. By clearly documenting the data analysis methods and procedures, researchers allow others to replicate the study, thereby contributing to the reproducibility of research findings.
  • In fields such as business or healthcare, data analysis helps organizations allocate resources more efficiently. By analyzing data on consumer behavior, market trends, or patient outcomes, organizations can make strategic decisions about resource allocation, budgeting, and planning.
  • In public policy and social sciences, data analysis is instrumental in developing and evaluating policies and interventions. By analyzing data on social, economic, or environmental factors, policymakers can assess the effectiveness of existing policies and inform the development of new ones.
  • Data analysis allows for continuous improvement in research methods and practices. By analyzing past research projects, identifying areas for improvement, and implementing changes based on data-driven insights, researchers can refine their approaches and enhance the quality of future research endeavors.

However, it is important to remember that mastering these techniques requires practice and continuous learning. That’s why we highly recommend the Data Analytics Course by Physics Wallah . Not only does it cover all the fundamentals of data analysis, but it also provides hands-on experience with various tools such as Excel, Python, and Tableau. Plus, if you use the “ READER ” coupon code at checkout, you can get a special discount on the course.

For Latest Tech Related Information, Join Our Official Free Telegram Group : PW Skills Telegram Group

Data Analysis Techniques in Research FAQs

What are the 5 techniques for data analysis?

The five techniques for data analysis include: Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, Prescriptive Analysis, and Qualitative Analysis.

What are techniques of data analysis in research?

Techniques of data analysis in research encompass both qualitative and quantitative methods. These techniques involve processes like summarizing raw data, investigating causes of events, forecasting future outcomes, offering recommendations based on predictions, and examining non-numerical data to understand concepts or experiences.

What are the 3 methods of data analysis?

The three primary methods of data analysis are: Qualitative Analysis, Quantitative Analysis, and Mixed-Methods Analysis.

What are the four types of data analysis techniques?

The four types of data analysis techniques are: Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, and Prescriptive Analysis.

  • Healthcare Analytics: Definition, Impact, and More


Healthcare analytics is transforming the way we approach patient care, administrative tasks, and overall healthcare strategies.

  • What are Analytical Insights: Definition & Best Practices


Analytical insights are important information or gains that can be extracted by interpreting and analysing data. Let us get more…

  • Python for Data Analysis


Python for data analysis opens up a realm of career opportunities, empowering individuals to unlock the true potential of data…


Related Articles

  • Top 25 Big Data Interview Questions and Answers
  • Graph Analytics – What Is it and Why Does It Matter?
  • 7 Best Courses on Data Analytics: Guide & Reviews
  • Which Course is Best for Business Analyst? (Business Analysts Online Courses)
  • Analysis of Algorithm in Data Structure
  • 5+ Best Data Analytics Certifications for Success in Your Career 2024!
  • AI and Predictive Analytics: Definition, Examples, Tools, Function, and More!



The Beginner's Guide to Statistical Analysis | 5 Steps & Examples

Statistical analysis means investigating trends, patterns, and relationships using quantitative data . It is an important research tool used by scientists, governments, businesses, and other organizations.

To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process . You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

After collecting data from your sample, you can organize and summarize the data using descriptive statistics . Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.

Table of contents

  • Step 1: Write your hypotheses and plan your research design
  • Step 2: Collect data from a sample
  • Step 3: Summarize your data with descriptive statistics
  • Step 4: Test hypotheses or make estimates with inferential statistics
  • Step 5: Interpret your results
  • Other interesting articles

To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.

Writing statistical hypotheses

The goal of research is often to investigate a relationship between variables within a population . You start with a prediction, and use statistical analysis to test that prediction.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.

While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.

  • Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
  • Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
  • Null hypothesis: Parental income and GPA have no relationship with each other in college students.
  • Alternative hypothesis: Parental income and GPA are positively correlated in college students.

Planning your research design

A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.

First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.

  • In an experimental design , you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
  • In a correlational design , you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
  • In a descriptive design , you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.

Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.

  • In a between-subjects design , you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
  • In a within-subjects design , you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
  • In a mixed (factorial) design , one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).
Example: Experimental research design
First, you'll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you'll record participants' scores from a second math test.

In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents' incomes and their own GPA.

Measuring variables

When planning a research design, you should operationalize your variables and decide exactly how you will measure them.

For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:

  • Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g. level of language ability).
  • Quantitative data represents amounts. These may be on an interval scale (e.g. test score) or a ratio scale (e.g. age).

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Variable              Type of data
Age                   Quantitative (ratio)
Gender                Categorical (nominal)
Race or ethnicity     Categorical (nominal)
Baseline test scores  Quantitative (interval)
Final test scores     Quantitative (interval)
Parental income       Quantitative (ratio)
GPA                   Quantitative (interval)


Population vs sample

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures . You should aim for a sample that is representative of the population.

Sampling for statistical analysis

There are two main approaches to selecting a sample.

  • Probability sampling: every member of the population has a chance of being selected for the study through random selection.
  • Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.

In theory, for highly generalizable findings, you should use a probability sampling method. Random selection reduces several types of research bias , like sampling bias , and ensures that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.

But in practice, it's rarely possible to gather the ideal sample. While non-probability samples are more at risk for biases like self-selection bias, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.

If you want to use parametric tests for non-probability samples, you have to make the case that:

  • your sample is representative of the population you’re generalizing your findings to.
  • your sample lacks systematic bias.

Keep in mind that external validity means that you can only generalize your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialized, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.

If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalized in your discussion section .

Create an appropriate sampling procedure

Based on the resources available for your research, decide on how you’ll recruit participants.

  • Will you have resources to advertise your study widely, including outside of your university setting?
  • Will you have the means to recruit a diverse sample that represents a broad population?
  • Do you have time to contact and follow up with members of hard-to-reach groups?

Your participants are self-selected by their schools. Although you're using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)
Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.

Calculate sufficient sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that's too small may be unrepresentative of the population, while a sample that's too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units per subgroup is usually considered necessary.

To use these calculators, you have to understand and input these key components:

  • Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
  • Statistical power : the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
  • Expected effect size : a standardized indication of how large the expected result of your study will be, usually based on other similar studies.
  • Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
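As a rough illustration, statistical software can solve for the sample size once these components are specified. The sketch below uses statsmodels' power analysis for an independent-samples t test; the effect size, alpha, and power values are illustrative assumptions, not recommendations for any particular study.

```python
from statsmodels.stats.power import TTestIndPower

# Illustrative assumptions: medium effect size, 5% significance, 80% power.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)

print(f"Required sample size per group: {n_per_group:.0f}")  # roughly 64
```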

Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarize them.

Inspect your data

There are various ways to inspect your data, including the following:

  • Organizing data from each variable in frequency distribution tables .
  • Displaying data from a key variable in a bar chart to view the distribution of responses.
  • Visualizing the relationship between two variables using a scatter plot .

By visualizing your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.

A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.

Mean, median, mode, and standard deviation in a normal distribution

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:

  • Mode : the most popular response or value in the data set.
  • Median : the value in the exact middle of the data set when ordered from low to high.
  • Mean : the sum of all values divided by the number of values.

However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:

  • Range : the highest value minus the lowest value of the data set.
  • Interquartile range : the range of the middle half of the data set.
  • Standard deviation : the average distance between each value in your data set and the mean.
  • Variance : the square of the standard deviation.

Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

                     Pretest scores    Posttest scores
Mean                 68.44             75.25
Standard deviation   9.43              9.88
Variance             88.96             97.96
Range                36.25             45.12
Sample size (n)      30

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)
After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.

It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.

                     Parental income (USD)    GPA
Mean                 62,100                   3.12
Standard deviation   15,000                   0.45
Variance             225,000,000              0.16
Range                8,000–378,000            2.64–4.00
Sample size (n)      653

A number that describes a sample is called a statistic , while a number describing a population is called a parameter . Using inferential statistics , you can make conclusions about population parameters based on sample statistics.

Researchers often use two main methods (simultaneously) to make inferences in statistics.

  • Estimation: calculating population parameters based on sample statistics.
  • Hypothesis testing: a formal process for testing research predictions about the population using samples.

You can make two types of estimates of population parameters from sample statistics:

  • A point estimate : a value that represents your best guess of the exact parameter.
  • An interval estimate : a range of values that represent your best guess of where the parameter lies.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
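For example, a 95% confidence interval for a mean can be built from the sample mean, the standard error, and a z score of about 1.96. The sketch below uses SciPy; the sample values are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical sample for illustration.
sample = np.array([68, 72, 75, 70, 74, 69, 77, 73, 71, 76])

mean = sample.mean()
standard_error = stats.sem(sample)  # standard error of the mean
z = stats.norm.ppf(0.975)           # z score for a 95% interval (about 1.96)

lower, upper = mean - z * standard_error, mean + z * standard_error
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```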

Hypothesis testing

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:

  • A test statistic tells you how much your data differs from the null hypothesis of the test.
  • A p value tells you the likelihood of obtaining your results if the null hypothesis is actually true in the population.

Statistical tests come in three main varieties:

  • Comparison tests assess group differences in outcomes.
  • Regression tests assess cause-and-effect relationships between variables.
  • Correlation tests assess relationships between variables without assuming causation.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in an outcome variable (or variables).

  • A simple linear regression includes one predictor variable and one outcome variable.
  • A multiple linear regression includes two or more predictor variables and one outcome variable.

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.

  • A t test is for exactly 1 or 2 groups when the sample is small (30 or less).
  • A z test is for exactly 1 or 2 groups when the sample is large.
  • An ANOVA is for 3 or more groups.

The z and t tests have subtypes based on the number and types of samples and the hypotheses:

  • If you have only one sample that you want to compare to a population mean, use a one-sample test .
  • If you have paired measurements (within-subjects design), use a dependent (paired) samples test .
  • If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test .
  • If you expect a difference between groups in a specific direction, use a one-tailed test .
  • If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test .

The only parametric correlation test is Pearson’s r . The correlation coefficient ( r ) tells you the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.

You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:

  • a t value (test statistic) of 3.00
  • a p value of 0.0028
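A dependent-samples, one-tailed t test like the one above can be run in a few lines with SciPy (version 1.6 or later for the alternative argument). The pretest and posttest scores below are hypothetical, so the resulting t and p values will differ from the ones reported in this example.

```python
from scipy import stats

# Hypothetical pretest and posttest scores for the same participants.
pretest = [64, 70, 68, 72, 66, 75, 69, 71]
posttest = [70, 74, 71, 78, 70, 80, 73, 77]

# Dependent (paired) samples, one-tailed: posttest expected to be greater.
t_stat, p_value = stats.ttest_rel(posttest, pretest, alternative="greater")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```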

Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.

A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:

  • a t value of 3.08
  • a p value of 0.001
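In practice, SciPy's pearsonr returns both the correlation coefficient and a two-tailed p value, which can be halved for a one-tailed test when the observed correlation is in the expected direction. The income and GPA values below are hypothetical and only sketch the calculation.

```python
import numpy as np
from scipy import stats

# Hypothetical parental income (USD) and GPA values for illustration.
income = np.array([32_000, 45_000, 51_000, 60_000, 68_000, 75_000, 82_000, 95_000])
gpa = np.array([2.8, 3.0, 3.1, 3.0, 3.3, 3.4, 3.6, 3.7])

r, p_two_tailed = stats.pearsonr(income, gpa)

# One-tailed p value, assuming a positive correlation was predicted.
p_one_tailed = p_two_tailed / 2 if r > 0 else 1 - p_two_tailed / 2
print(f"r = {r:.3f}, one-tailed p = {p_one_tailed:.4f}")
```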


The final step of statistical analysis is interpreting your results.

Statistical significance

In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.

This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)
You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.

Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.

Effect size

A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.

In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .

With a Cohen's d of 0.72, there's medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)
To determine the effect size of the correlation coefficient, you compare your Pearson's r value to Cohen's effect size criteria.
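One common way to compute Cohen's d for pretest and posttest scores is to divide the mean difference by a pooled standard deviation, as sketched below; the scores are hypothetical and other formulations of d exist.

```python
import numpy as np

# Hypothetical pretest and posttest scores for illustration.
pretest = np.array([64, 70, 68, 72, 66, 75, 69, 71], dtype=float)
posttest = np.array([70, 74, 71, 78, 70, 80, 73, 77], dtype=float)

# Cohen's d: mean difference divided by the pooled standard deviation.
pooled_sd = np.sqrt((pretest.std(ddof=1) ** 2 + posttest.std(ddof=1) ** 2) / 2)
cohens_d = (posttest.mean() - pretest.mean()) / pooled_sd

print(f"Cohen's d = {cohens_d:.2f}")
```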

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.

You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.

Frequentist versus Bayesian statistics

Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis.

However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.

A Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis, rather than leading to a conclusion about rejecting the null hypothesis or not.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Student’s  t -distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval

Methodology

  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Likert scale

Research bias

  • Implicit bias
  • Framing effect
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hostile attribution bias
  • Affect heuristic


Other students also liked.

  • Descriptive Statistics | Definitions, Types, Examples
  • Inferential Statistics | An Easy Introduction & Examples
  • Choosing the Right Statistical Test | Types & Examples

More interesting articles

  • Akaike Information Criterion | When & How to Use It (Example)
  • An Easy Introduction to Statistical Significance (With Examples)
  • An Introduction to t Tests | Definitions, Formula and Examples
  • ANOVA in R | A Complete Step-by-Step Guide with Examples
  • Central Limit Theorem | Formula, Definition & Examples
  • Central Tendency | Understanding the Mean, Median & Mode
  • Chi-Square (Χ²) Distributions | Definition & Examples
  • Chi-Square (Χ²) Table | Examples & Downloadable Table
  • Chi-Square (Χ²) Tests | Types, Formula & Examples
  • Chi-Square Goodness of Fit Test | Formula, Guide & Examples
  • Chi-Square Test of Independence | Formula, Guide & Examples
  • Coefficient of Determination (R²) | Calculation & Interpretation
  • Correlation Coefficient | Types, Formulas & Examples
  • Frequency Distribution | Tables, Types & Examples
  • How to Calculate Standard Deviation (Guide) | Calculator & Examples
  • How to Calculate Variance | Calculator, Analysis & Examples
  • How to Find Degrees of Freedom | Definition & Formula
  • How to Find Interquartile Range (IQR) | Calculator & Examples
  • How to Find Outliers | 4 Ways with Examples & Explanation
  • How to Find the Geometric Mean | Calculator & Formula
  • How to Find the Mean | Definition, Examples & Calculator
  • How to Find the Median | Definition, Examples & Calculator
  • How to Find the Mode | Definition, Examples & Calculator
  • How to Find the Range of a Data Set | Calculator & Formula
  • Hypothesis Testing | A Step-by-Step Guide with Easy Examples
  • Interval Data and How to Analyze It | Definitions & Examples
  • Levels of Measurement | Nominal, Ordinal, Interval and Ratio
  • Linear Regression in R | A Step-by-Step Guide & Examples
  • Missing Data | Types, Explanation, & Imputation
  • Multiple Linear Regression | A Quick Guide (Examples)
  • Nominal Data | Definition, Examples, Data Collection & Analysis
  • Normal Distribution | Examples, Formulas, & Uses
  • Null and Alternative Hypotheses | Definitions & Examples
  • One-way ANOVA | When and How to Use It (With Examples)
  • Ordinal Data | Definition, Examples, Data Collection & Analysis
  • Parameter vs Statistic | Definitions, Differences & Examples
  • Pearson Correlation Coefficient (r) | Guide & Examples
  • Poisson Distributions | Definition, Formula & Examples
  • Probability Distribution | Formula, Types, & Examples
  • Quartiles & Quantiles | Calculation, Definition & Interpretation
  • Ratio Scales | Definition, Examples, & Data Analysis
  • Simple Linear Regression | An Easy Introduction & Examples
  • Skewness | Definition, Examples & Formula
  • Statistical Power and Why It Matters | A Simple Introduction
  • Student's t Table (Free Download) | Guide & Examples
  • T-distribution: What it is and how to use it
  • Test statistics | Definition, Interpretation, and Examples
  • The Standard Normal Distribution | Calculator, Examples & Uses
  • Two-Way ANOVA | Examples & When To Use It
  • Type I & Type II Errors | Differences, Examples, Visualizations
  • Understanding Confidence Intervals | Easy Examples & Formulas
  • Understanding P values | Definition and Examples
  • Variability | Calculating Range, IQR, Variance, Standard Deviation
  • What is Effect Size and Why Does It Matter? (Examples)
  • What Is Kurtosis? | Definition, Examples & Formula
  • What Is Standard Error? | How to Calculate (Guide with Examples)

What is your plagiarism score?

facebook

12 Unexplored Data Analysis Tools for Qualitative Research


Welcome to our guide to 12 lesser-known tools for studying information in a different way – specifically designed for understanding and interpreting data in qualitative research. Data analysis tools for qualitative research are specialized instruments designed to interpret non-numerical data, offering insights into patterns, themes, and relationships.

These tools enable researchers to uncover meaning from qualitative information, enhancing the depth and understanding of complex phenomena in fields such as social sciences, psychology, and humanities.

In the world of research, there are tools tailored for qualitative data analysis that can reveal hidden insights. This blog explores these tools, showcasing their unique features and advantages compared to the more commonly used quantitative analysis tools.

Whether you’re a seasoned researcher or just starting out, we aim to make these tools accessible and highlight how they can add depth and accuracy to your analysis. Join us as we uncover these innovative approaches, offering practical solutions to enhance your experience with qualitative research.

Tool 1: MAXQDA Analytics Pro


MAXQDA Analytics Pro emerges as a game-changing tool for qualitative data analysis, offering a seamless experience that goes beyond the capabilities of traditional quantitative tools.

Here’s how MAXQDA stands out in the world of qualitative research:

Advanced Coding and Text Analysis: MAXQDA empowers researchers with advanced coding features and text analysis tools, enabling the exploration of qualitative data with unprecedented depth. Its intuitive interface allows for efficient categorization and interpretation of textual information.

Intuitive Interface for Effortless Exploration: The user-friendly design of MAXQDA makes it accessible for researchers of all levels. This tool streamlines the process of exploring qualitative data, facilitating a more efficient and insightful analysis compared to traditional quantitative tools.

Uncovering Hidden Narratives: MAXQDA excels in revealing hidden narratives within qualitative data, allowing researchers to identify patterns, themes, and relationships that might be overlooked by conventional quantitative approaches. This capability adds a valuable layer to the analysis of complex phenomena.

In the landscape of qualitative data analysis tools, MAXQDA Analytics Pro is a valuable asset, providing researchers with a unique set of features that enhance the depth and precision of their analysis. Its contribution extends beyond the confines of quantitative analysis tools, making it an indispensable tool for those seeking innovative approaches to qualitative research.

Tool 2: Quirkos


Quirkos, positioned as data analysis software, shines as a transformative tool within the world of qualitative research.

Here’s why Quirkos is considered among the best for quality data analysis:

Visual Approach for Enhanced Understanding: Quirkos introduces a visual approach, setting it apart from conventional analysis software. This unique feature aids researchers in easily grasping and interpreting qualitative data, promoting a more comprehensive understanding of complex information.

User-Friendly Interface: One of Quirkos’ standout features is its user-friendly interface. This makes it accessible to researchers of various skill levels, ensuring that the tool’s benefits are not limited to experienced users. Its simplicity adds to the appeal for those seeking the best quality data analysis software.

Effortless Pattern Identification: Quirkos simplifies the process of identifying patterns within qualitative data. This capability is crucial for researchers aiming to conduct in-depth analysis efficiently.

The tool’s intuitive design fosters a seamless exploration of data, making it an indispensable asset in the world of analysis software. Quirkos, recognized among the best quality data analysis software, offers a visual and user-friendly approach to qualitative research. Its ability to facilitate effortless pattern identification positions it as a valuable asset for researchers seeking optimal outcomes in their data analysis endeavors.

Tool 3: Provalis Research WordStat


Provalis Research WordStat stands out as a powerful tool within the world of qualitative data analysis tools, offering unique advantages for researchers engaged in qualitative analysis:

WordStat excels in text mining, providing researchers with a robust platform to delve into vast amounts of textual data. This capability enhances the depth of qualitative analysis, setting it apart in the landscape of tools for qualitative research.

Specializing in content analysis, WordStat facilitates the systematic examination of textual information. Researchers can uncover themes, trends, and patterns within qualitative data, contributing to a more comprehensive understanding of complex phenomena.

WordStat seamlessly integrates with qualitative research methodologies, providing a bridge between quantitative and qualitative analysis. This integration allows researchers to harness the strengths of both approaches, expanding the possibilities for nuanced insights.

In the domain of tools for qualitative research, Provalis Research WordStat emerges as a valuable asset. Its text mining capabilities, content analysis expertise, and integration with qualitative research methodologies collectively contribute to elevating the qualitative analysis experience for researchers.

Tool 4: ATLAS.ti


ATLAS.ti proves to be a cornerstone in the world of qualitative data analysis tools, offering distinctive advantages that enhance the qualitative analysis process:

Multi-Faceted Data Exploration: ATLAS.ti facilitates in-depth exploration of textual, graphical, and multimedia data. This versatility enables researchers to engage with diverse types of qualitative information, broadening the scope of analysis beyond traditional boundaries.

Collaboration and Project Management: The tool excels in fostering collaboration among researchers and project management. This collaborative aspect sets ATLAS.ti apart, making it a comprehensive solution for teams engaged in qualitative research endeavors.

User-Friendly Interface: ATLAS.ti provides a user-friendly interface, ensuring accessibility for researchers of various skill levels. This simplicity in navigation enhances the overall qualitative analysis experience, making it an effective tool for both seasoned researchers and those new to data analysis tools.

In the landscape of tools for qualitative research, ATLAS.ti emerges as a valuable ally. Its multi-faceted data exploration, collaboration features, and user-friendly interface collectively contribute to enriching the qualitative analysis journey for researchers seeking a comprehensive and efficient solution.

Tool 5: NVivo Transcription


NVivo Transcription emerges as a valuable asset in the world of data analysis tools, seamlessly integrating transcription services with qualitative research methodologies:

Efficient Transcription Services: NVivo Transcription offers efficient and accurate transcription services, streamlining the process of converting spoken words into written text. This feature is essential for researchers engaged in qualitative analysis, ensuring a solid foundation for subsequent exploration.

Integration with NVivo Software: The tool seamlessly integrates with NVivo software, creating a synergistic relationship between transcription and qualitative analysis. Researchers benefit from a unified platform that simplifies the organization and analysis of qualitative data, enhancing the overall research workflow.

Comprehensive Qualitative Analysis: NVivo Transcription contributes to comprehensive qualitative analysis by providing a robust foundation for understanding and interpreting audio and video data. Researchers can uncover valuable insights within the transcribed content, enriching the qualitative analysis process.

In the landscape of tools for qualitative research, NVivo Transcription plays a crucial role in bridging the gap between transcription services and qualitative analysis. Its efficient transcription capabilities, integration with NVivo software, and support for comprehensive qualitative analysis make it a valuable tool for researchers seeking a streamlined and effective approach to handling qualitative data.

Tool 6: Dedoose

Web-Based Accessibility: Dedoose’s online platform allows PhD researchers to conduct qualitative data analysis from anywhere, promoting flexibility and collaboration.

Mixed-Methods Support: Dedoose accommodates mixed-methods research, enabling the integration of both quantitative and qualitative data for a comprehensive analysis.

Multi-Media Compatibility: The tool supports various data formats, including text, audio, and video, facilitating the analysis of diverse qualitative data types.

Collaborative Features: Dedoose fosters collaboration among researchers, providing tools for shared coding, annotation, and exploration of qualitative data.

Organized Data Management: PhD researchers benefit from Dedoose’s organizational features, streamlining the coding and retrieval of data for a more efficient analysis process.

Tool 7: HyperRESEARCH

HyperRESEARCH caters to various qualitative research methods, including content analysis and grounded theory, offering a flexible platform for PhD researchers.

The software simplifies the coding and retrieval of data, aiding researchers in organizing and analyzing qualitative information systematically.

HyperRESEARCH allows for detailed annotation of text, enhancing the depth of qualitative analysis and providing a comprehensive understanding of the data.

The tool provides features for visualizing relationships within data, aiding researchers in uncovering patterns and connections in qualitative content.

HyperRESEARCH facilitates collaborative research efforts, promoting teamwork and shared insights among PhD researchers.

Tool 8: MAXQDA Analytics Plus

Advanced Collaboration:  

MAXQDA Analytics Plus enhances collaboration for PhD researchers with teamwork support, enabling multiple researchers to work seamlessly on qualitative data analysis.

Extended Visualization Tools:  

The software offers advanced data visualization features, allowing researchers to create visual representations of qualitative data patterns for a more comprehensive understanding.

Efficient Workflow:  

MAXQDA Analytics Plus streamlines the qualitative analysis workflow, providing tools that facilitate efficient coding, categorization, and interpretation of complex textual information.

Deeper Insight Integration:  

Building upon MAXQDA Analytics Pro, MAXQDA Analytics Plus integrates additional features for a more nuanced qualitative analysis, empowering PhD researchers to gain deeper insights into their research data.

User-Friendly Interface:  

The tool maintains a user-friendly interface, ensuring accessibility for researchers of various skill levels, contributing to an effective and efficient data analysis experience.

Tool 9: QDA Miner

Versatile Data Analysis: QDA Miner supports a wide range of qualitative research methodologies, accommodating diverse data types, including text, images, and multimedia, catering to the varied needs of PhD researchers.

Coding and Annotation Tools: The software provides robust coding and annotation features, facilitating a systematic organization and analysis of qualitative data for in-depth exploration.

Visual Data Exploration: QDA Miner includes visualization tools for researchers to analyze data patterns visually, aiding in the identification of themes and relationships within qualitative content.

User-Friendly Interface: With a user-friendly interface, QDA Miner ensures accessibility for researchers at different skill levels, contributing to a seamless and efficient qualitative data analysis experience.

Comprehensive Analysis Support: QDA Miner’s features contribute to a comprehensive analysis, offering PhD researchers a tool that integrates seamlessly into their qualitative research endeavors.

Tool 10: NVivo

NVivo supports diverse qualitative research methodologies, allowing PhD researchers to analyze text, images, audio, and video data for a comprehensive understanding.

The software aids researchers in organizing and categorizing qualitative data systematically, streamlining the coding and analysis process.

NVivo seamlessly integrates with various data formats, providing a unified platform for transcription services and qualitative analysis, simplifying the overall research workflow.

NVivo offers tools for visual representation, enabling researchers to create visual models that enhance the interpretation of qualitative data patterns and relationships.

NVivo Transcription integration ensures efficient handling of audio and video data, offering PhD researchers a comprehensive solution for qualitative data analysis.

Tool 11: Weft QDA

Open-Source Affordability: Weft QDA’s open-source nature makes it an affordable option for PhD researchers on a budget, providing cost-effective access to qualitative data analysis tools.

Simplicity for Beginners: With a straightforward interface, Weft QDA is user-friendly and ideal for researchers new to qualitative data analysis, offering basic coding and text analysis features.

Ease of Use: The tool simplifies the process of coding and analyzing qualitative data, making it accessible to researchers of varying skill levels and ensuring a smooth and efficient analysis experience.

Entry-Level Solution: Weft QDA serves as a suitable entry-level option, introducing PhD researchers to the fundamentals of qualitative data analysis without overwhelming complexity.

Basic Coding Features: While being simple, Weft QDA provides essential coding features, enabling researchers to organize and explore qualitative data effectively.

Tool 12: Transana

Transana specializes in the analysis of audio and video data, making it a valuable tool for PhD researchers engaged in qualitative studies with rich multimedia content.

The software streamlines the transcription process, aiding researchers in converting spoken words into written text, providing a foundation for subsequent qualitative analysis.

Transana allows for in-depth exploration of multimedia data, facilitating coding and analysis of visual and auditory aspects crucial to certain qualitative research projects.

With tools for transcribing and coding, Transana assists PhD researchers in organizing and categorizing qualitative data, promoting a structured and systematic approach to analysis.

Researchers benefit from Transana’s capabilities to uncover valuable insights within transcribed content, enriching the qualitative analysis process with a focus on visual and auditory dimensions.

Final Thoughts

In wrapping up our journey through 12 lesser-known data analysis tools for qualitative research, it’s clear these tools bring a breath of fresh air to the world of analysis. MAXQDA Analytics Pro, Quirkos, Provalis Research WordStat, ATLAS.ti, and NVivo Transcription, among others, each offer something unique, steering away from the usual quantitative analysis tools.

They go beyond, with MAXQDA’s advanced coding, Quirkos’ visual approach, WordStat’s text mining, ATLAS.ti’s multi-faceted data exploration, and NVivo Transcription’s seamless integration.

These tools aren’t just alternatives; they are untapped resources for qualitative research. As we bid adieu to the traditional quantitative tools, these unexplored gems beckon researchers to a world where hidden narratives and patterns are waiting to be discovered.

They don’t just add to the toolbox; they redefine how we approach and understand complex phenomena. In a world where research is evolving rapidly, these tools for qualitative research stand out as beacons of innovation and efficiency.

PhDGuidance is a website that provides customized solutions for PhD researchers in the field of qualitative analysis. They offer comprehensive guidance for research topics, thesis writing, and publishing. Their team of expert consultants helps researchers conduct rigorous research in areas such as the social sciences, humanities, and more, aiming to provide a comprehensive understanding of the research problem.

PhDGuidance offers qualitative data analysis services that help researchers observe participants and analyse their behaviour for the research work. They provide both manual thematic analysis and NVivo-based analysis for data coding. They also offer customized solutions for research design, data collection, literature review, language correction, and analytical tools and techniques for both qualitative and quantitative research projects.

Frequently Asked Questions

1. What is the best free qualitative data analysis software?

When it comes to free qualitative data analysis software, one standout option is RQDA. RQDA, an open-source tool, provides a user-friendly platform for coding and analyzing textual data. Its compatibility with R, a statistical computing language, adds a layer of flexibility for those familiar with programming. Another notable mention is QDA Miner Lite, offering basic qualitative analysis features at no cost. While these free tools may not match the advanced capabilities of premium software, they serve as excellent starting points for individuals or small projects with budget constraints.

2. Which software is used to analyse qualitative data?

For a more comprehensive qualitative data analysis experience, many researchers turn to premium tools like NVivo, MAXQDA, or ATLAS.ti. NVivo, in particular, stands out due to its user-friendly interface, robust coding capabilities, and integration with various data types, including audio and visual content. MAXQDA and ATLAS.ti also offer advanced features for qualitative data analysis, providing researchers with tools to explore, code, and interpret complex qualitative information effectively.

3. How can I analyse my qualitative data?

Analyzing qualitative data involves a systematic approach to make sense of textual, visual, or audio information. Here’s a general guide:

Data Familiarization: Understand the context and content of your data through thorough reading or viewing.

Open Coding: Begin with open coding, identifying and labeling key concepts without preconceived categories.

Axial Coding: Organize codes into broader categories, establishing connections and relationships between them.

Selective Coding: Focus on the most significant codes, creating a narrative that tells the story of your data.

Constant Comparison: Continuously compare new data with existing codes to refine categories and ensure consistency.

Use of Software: Employ qualitative data analysis software, such as NVivo or MAXQDA, to facilitate coding, organization, and interpretation.

4. Is it worth using NVivo for qualitative data analysis?

The use of NVivo for qualitative data analysis depends on the specific needs of the researcher and the scale of the project. NVivo is worth considering for its versatility, user-friendly interface, and ability to handle diverse data types. It streamlines the coding process, facilitates collaboration, and offers in-depth analytical tools. However, its cost may be a consideration for individuals or smaller research projects. Researchers with complex data sets, especially those involving multimedia content, may find NVivo’s advanced features justify the investment.

5. What are the tools used in quantitative data analysis?

Quantitative data analysis relies on tools specifically designed to handle numerical data. Some widely used tools include:

SPSS (Statistical Package for the Social Sciences): A statistical software suite that facilitates data analysis through descriptive statistics, regression analysis, and more.

Excel: Widely used for basic quantitative analysis, offering functions for calculations, charts, and statistical analysis.

R and RStudio: An open-source programming language and integrated development environment used for statistical computing and graphics.

Python with Pandas and NumPy: Python is a versatile programming language, and Pandas and NumPy are libraries that provide powerful tools for data manipulation and analysis.

STATA: A software suite for data management and statistical analysis, widely used in various fields.

Hence, the choice of qualitative data analysis software depends on factors like project scale, budget, and specific requirements. Free tools like RQDA and QDA Miner Lite offer viable options for smaller projects, while premium software such as NVivo, MAXQDA, and ATLAS.ti provide advanced features for more extensive research endeavors. When it comes to quantitative data analysis, SPSS, Excel, R, Python, and STATA are among the widely used tools, each offering unique strengths for numerical data interpretation. Ultimately, the selection should align with the researcher’s goals and the nature of the data being analyzed.
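Since Python with pandas and NumPy is mentioned above as a common route for quantitative work, here is a minimal, illustrative sketch of the kind of descriptive summary those libraries produce. The column names and values are invented purely for the example.

```python
# A small, invented dataset summarised with pandas (descriptive statistics
# and a Pearson correlation matrix), mirroring what SPSS or Excel would give.
import pandas as pd

df = pd.DataFrame({
    "age":   [34, 29, 41, 38, 30, 45, 27, 36],
    "score": [72, 65, 80, 74, 68, 85, 60, 77],
})

print(df.describe())   # count, mean, std, min, quartiles and max per column
print(df.corr())       # Pearson correlation between the numeric columns
```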


Indian J Anaesth, 2016 Sep; 60(9)

Basic statistical tools in research and data analysis

Zulfiqar Ali

Department of Anaesthesiology, Division of Neuroanaesthesiology, Sheri Kashmir Institute of Medical Sciences, Soura, Srinagar, Jammu and Kashmir, India

S Bala Bhaskar

1 Department of Anaesthesiology and Critical Care, Vijayanagar Institute of Medical Sciences, Bellary, Karnataka, India

Statistical methods involved in carrying out a study include planning, designing, collecting data, analysing, drawing meaningful interpretation and reporting of the research findings. Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical tests are used. This article will try to acquaint the reader with the basic research tools that are utilised while conducting various studies. The article covers a brief outline of the variables, an understanding of quantitative and qualitative variables and the measures of central tendency. An idea of the sample size estimation, power analysis and the statistical errors is given. Finally, there is a summary of parametric and non-parametric tests used for data analysis.

INTRODUCTION

Statistics is a branch of science that deals with the collection, organisation, analysis of data and drawing of inferences from the samples to the whole population.[ 1 ] This requires a proper design of the study, an appropriate selection of the study sample and choice of a suitable statistical test. An adequate knowledge of statistics is necessary for proper designing of an epidemiological study or a clinical trial. Improper statistical methods may result in erroneous conclusions which may lead to unethical practice.[ 2 ]

A variable is a characteristic that varies from one individual member of a population to another.[ 3 ] Variables such as height and weight are measured by some type of scale, convey quantitative information and are called quantitative variables. Sex and eye colour give qualitative information and are called qualitative variables[ 3 ] [ Figure 1 ].

[Figure 1: Classification of variables]

Quantitative variables

Quantitative or numerical data are subdivided into discrete and continuous measurements. Discrete numerical data are recorded as a whole number such as 0, 1, 2, 3,… (integer), whereas continuous data can assume any value. Observations that can be counted constitute the discrete data and observations that can be measured constitute the continuous data. Examples of discrete data are number of episodes of respiratory arrests or the number of re-intubations in an intensive care unit. Similarly, examples of continuous data are the serial serum glucose levels, partial pressure of oxygen in arterial blood and the oesophageal temperature.

A hierarchical scale of increasing precision can be used for observing and recording the data which is based on categorical, ordinal, interval and ratio scales [ Figure 1 ].

Categorical or nominal variables are unordered. The data are merely classified into categories and cannot be arranged in any particular order. If only two categories exist (as in gender: male and female), it is called dichotomous (or binary) data. The various causes of re-intubation in an intensive care unit due to upper airway obstruction, impaired clearance of secretions, hypoxemia, hypercapnia, pulmonary oedema and neurological impairment are examples of categorical variables.

Ordinal variables have a clear ordering between the variables. However, the ordered data may not have equal intervals. Examples are the American Society of Anesthesiologists status or Richmond agitation-sedation scale.

Interval variables are similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. A good example of an interval scale is the Fahrenheit degree scale used to measure temperature. With the Fahrenheit scale, the difference between 70° and 75° is equal to the difference between 80° and 85°: The units of measurement are equal throughout the full range of the scale.

Ratio scales are similar to interval scales, in that equal differences between scale values have equal quantitative meaning. However, ratio scales also have a true zero point, which gives them an additional property. For example, the system of centimetres is an example of a ratio scale. There is a true zero point and the value of 0 cm means a complete absence of length. The thyromental distance of 6 cm in an adult may be twice that of a child in whom it may be 3 cm.

STATISTICS: DESCRIPTIVE AND INFERENTIAL STATISTICS

Descriptive statistics[ 4 ] try to describe the relationship between variables in a sample or population. Descriptive statistics provide a summary of data in the form of mean, median and mode. Inferential statistics[ 4 ] use a random sample of data taken from a population to describe and make inferences about the whole population. They are valuable when it is not possible to examine each member of an entire population. Examples of descriptive and inferential statistics are illustrated in Table 1 .

[Table 1: Examples of descriptive and inferential statistics]

Descriptive statistics

The extent to which the observations cluster around a central location is described by the central tendency and the spread towards the extremes is described by the degree of dispersion.

Measures of central tendency

The measures of central tendency are mean, median and mode.[ 6 ] Mean (or the arithmetic average) is the sum of all the scores divided by the number of scores. Mean may be influenced profoundly by the extreme variables. For example, the average stay of organophosphorus poisoning patients in ICU may be influenced by a single patient who stays in ICU for around 5 months because of septicaemia. The extreme values are called outliers. The formula for the mean is

x̄ = Σx / n

where x = each observation and n = number of observations. Median[ 6 ] is defined as the middle of a distribution in ranked data (with half of the variables in the sample above and half below the median value), while mode is the most frequently occurring variable in a distribution.

Range defines the spread, or variability, of a sample.[ 7 ] It is described by the minimum and maximum values of the variables. If we rank the data and, after ranking, group the observations into percentiles, we can get better information about the pattern of spread of the variables. In percentiles, we rank the observations into 100 equal parts. We can then describe the 25th, 50th, 75th or any other percentile. The median is the 50th percentile. The interquartile range comprises the middle 50% of the observations about the median (25th-75th percentile).

Variance[ 7 ] is a measure of how spread out the distribution is. It gives an indication of how closely an individual observation clusters about the mean value. The variance of a population is defined by the following formula:

σ² = Σ(Xᵢ − X)² / N

where σ² is the population variance, X is the population mean, Xᵢ is the i-th element from the population and N is the number of elements in the population. The variance of a sample is defined by a slightly different formula:

s² = Σ(xᵢ − x)² / (n − 1)

where s² is the sample variance, x is the sample mean, xᵢ is the i-th element from the sample and n is the number of elements in the sample. The formula for the variance of a population has the value ‘N’ as the denominator. The expression ‘n − 1’ is known as the degrees of freedom and is one less than the number of observations: each observation is free to vary, except the last one, which must take a defined value. The variance is measured in squared units. To make the interpretation of the data simple and to retain the basic unit of observation, the square root of the variance is used. The square root of the variance is the standard deviation (SD).[ 8 ] The SD of a population is defined by the following formula:

σ = √[ Σ(Xᵢ − X)² / N ]

where σ is the population SD, X is the population mean, Xᵢ is the i-th element from the population and N is the number of elements in the population. The SD of a sample is defined by a slightly different formula:

s = √[ Σ(xᵢ − x)² / (n − 1) ]

where s is the sample SD, x is the sample mean, xᵢ is the i-th element from the sample and n is the number of elements in the sample. An example of the calculation of variance and SD is illustrated in Table 2 .

[Table 2: Example of mean, variance and standard deviation]
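As a small illustration of the measures defined above, the Python sketch below computes the mean, median, mode, sample variance and sample SD for an invented set of observations using the standard library's statistics module; the numbers are not taken from the article's table.

```python
# Descriptive measures for a small, invented sample.
import statistics

x = [5, 7, 7, 8, 9, 10, 12]

print(statistics.mean(x))      # sum of the scores divided by n
print(statistics.median(x))    # middle value of the ranked data
print(statistics.mode(x))      # most frequently occurring value
print(statistics.variance(x))  # sample variance, with n - 1 in the denominator
print(statistics.stdev(x))     # sample SD, the square root of the variance
```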

Normal distribution or Gaussian distribution

Most biological variables cluster around a central value, with symmetrical positive and negative deviations about this point.[ 1 ] The standard normal distribution curve is symmetrical and bell-shaped. In a normal distribution curve, about 68% of the scores are within 1 SD of the mean, around 95% are within 2 SDs of the mean and about 99.7% are within 3 SDs of the mean [ Figure 2 ].

[Figure 2: Normal distribution curve]
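The proportions quoted above can be checked numerically; the sketch below uses SciPy's standard normal distribution (a tooling assumption, not part of the original article) to compute the area within 1, 2 and 3 SDs of the mean.

```python
# Area under the standard normal curve within k SDs of the mean.
from scipy import stats

for k in (1, 2, 3):
    p = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"within {k} SD: {p:.3f}")   # roughly 0.683, 0.954, 0.997
```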

Skewed distribution

It is a distribution with an asymmetry of the variables about its mean. In a negatively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the right, producing a longer left tail. In a positively skewed distribution [ Figure 3 ], the mass of the distribution is concentrated on the left, leading to a longer right tail.

[Figure 3: Curves showing negatively skewed and positively skewed distributions]
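For a rough sense of how skewness is quantified in practice, the sketch below applies SciPy's sample skewness estimate to two small invented datasets; a positive value indicates a longer right tail and a negative value a longer left tail.

```python
# Sample skewness of two invented datasets.
from scipy import stats

positively_skewed = [1, 2, 2, 3, 3, 3, 4, 20]    # long right tail
negatively_skewed = [-20, 1, 2, 2, 3, 3, 3, 4]   # long left tail

print(stats.skew(positively_skewed))   # positive value
print(stats.skew(negatively_skewed))   # negative value
```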

Inferential statistics

In inferential statistics, data are analysed from a sample to make inferences in the larger collection of the population. The purpose is to answer or test the hypotheses. A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon. Hypothesis tests are thus procedures for making rational decisions about the reality of observed effects.

Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1 (where 0 indicates impossibility and 1 indicates certainty).

In inferential statistics, the term ‘null hypothesis’ (H0, ‘H-naught’ or ‘H-null’) denotes that there is no relationship (difference) between the population variables in question.[ 9 ]

The alternative hypothesis (H1 or Ha) denotes that a relationship (difference) between the variables is expected to be true.[ 9 ]

The P value (or the calculated probability) is the probability of obtaining the observed (or a more extreme) result by chance if the null hypothesis is true. The P value is a number between 0 and 1 and is interpreted by researchers in deciding whether to reject or retain the null hypothesis [ Table 3 ].

[Table 3: P values with interpretation]

If the P value is less than the arbitrarily chosen value (known as α or the significance level), the null hypothesis (H0) is rejected [ Table 4 ]. However, if the null hypothesis (H0) is incorrectly rejected, this is known as a Type I error.[ 11 ] Further details regarding alpha error, beta error and sample size calculation, and the factors influencing them, are dealt with in another section of this issue by Das S et al.[ 12 ]

[Table 4: Illustration for null hypothesis]
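A minimal sketch of the decision rule described above, assuming SciPy (version 1.7 or later, for binomtest) is available: the null hypothesis is that a coin is fair, the invented observation is 16 heads in 20 tosses, and the resulting P value is compared with a significance level of 0.05.

```python
# Compare a P value with a chosen significance level (alpha).
from scipy import stats

alpha = 0.05
result = stats.binomtest(k=16, n=20, p=0.5)   # two-sided exact binomial test

print(f"P value = {result.pvalue:.4f}")
if result.pvalue < alpha:
    print("Reject H0: the observed proportion differs from 0.5")
else:
    print("Fail to reject H0")
```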

PARAMETRIC AND NON-PARAMETRIC TESTS

Numerical data (quantitative variables) that are normally distributed are analysed with parametric tests.[ 13 ]

Two most basic prerequisites for parametric statistical analysis are:

  • The assumption of normality which specifies that the means of the sample group are normally distributed
  • The assumption of equal variance which specifies that the variances of the samples and of their corresponding population are equal.

However, if the distribution of the sample is skewed towards one side or the distribution is unknown due to the small sample size, non-parametric[ 14 ] statistical techniques are used. Non-parametric tests are used to analyse ordinal and categorical data.

Parametric tests

The parametric tests assume that the data are on a quantitative (numerical) scale, with a normal distribution of the underlying population. The samples have the same variance (homogeneity of variances). The samples are randomly drawn from the population, and the observations within a group are independent of each other. The commonly used parametric tests are the Student's t -test, analysis of variance (ANOVA) and repeated measures ANOVA.

Student's t -test

Student's t -test is used to test the null hypothesis that there is no difference between the means of the two groups. It is used in three circumstances:

  • To test if the mean of a sample differs significantly from a known or hypothesised population mean (the one-sample t-test). The formula is:

t = (X − μ) / SE

where X = sample mean, μ = population mean and SE = standard error of the mean.

  • To test if the population means estimated by two independent samples differ significantly (the unpaired t-test). The formula is:

t = (X1 − X2) / SE

where X1 − X2 is the difference between the means of the two groups and SE denotes the standard error of the difference.

  • To test if the population means estimated by two dependent samples differ significantly (the paired t -test). A usual setting for paired t -test is when measurements are made on the same subjects before and after a treatment.

The formula for paired t -test is:

t = d / SE

where d is the mean difference and SE denotes the standard error of this difference.

The group variances can be compared using the F-test. The F-test is the ratio of variances (var 1 / var 2). If F differs significantly from 1.0, then it is concluded that the group variances differ significantly.
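The three circumstances above map directly onto three SciPy functions; the sketch below runs each on small invented samples and is meant only as an illustration of the workflow, not as a reproduction of the article's examples.

```python
# One-sample, unpaired and paired t-tests on invented data.
from scipy import stats

group_a = [5.1, 5.3, 4.8, 5.0, 5.2, 4.9]
group_b = [5.6, 5.8, 5.5, 5.9, 5.7, 5.4]
before  = [140, 152, 138, 145, 150]
after   = [132, 148, 135, 139, 144]

print(stats.ttest_1samp(group_a, popmean=5.0))  # sample mean vs a hypothesised population mean
print(stats.ttest_ind(group_a, group_b))        # two independent samples
print(stats.ttest_rel(before, after))           # paired measurements on the same subjects
```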

Analysis of variance

The Student's t -test cannot be used for comparison of three or more groups. The purpose of ANOVA is to test if there is any significant difference between the means of two or more groups.

In ANOVA, we study two variances – (a) between-group variability and (b) within-group variability. The within-group variability (error variance) is the variation that cannot be accounted for in the study design. It is based on random differences present in our samples.

However, the between-group variability (or effect variance) is the result of our treatment. These two estimates of variance are compared using the F-test.

A simplified formula for the F statistic is:

F = MSb / MSw

where MSb is the mean squares between the groups and MSw is the mean squares within groups.
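As a small illustration of a one-way ANOVA, the sketch below compares the means of three invented groups with SciPy; the F statistic it returns is the ratio of between-group to within-group mean squares described above.

```python
# One-way ANOVA on three invented groups.
from scipy import stats

group1 = [23, 25, 21, 22, 24]
group2 = [30, 28, 31, 29, 27]
group3 = [26, 24, 27, 25, 28]

f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_stat:.2f}, P = {p_value:.4f}")
```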

Repeated measures analysis of variance

As with ANOVA, repeated measures ANOVA analyses the equality of means of three or more groups. However, a repeated measures ANOVA is used when all variables of a sample are measured under different conditions or at different points in time.

As the variables are measured from a sample at different points of time, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: The data violate the ANOVA assumption of independence. Hence, in the measurement of repeated dependent variables, repeated measures ANOVA should be used.

Non-parametric tests

When the assumptions of normality are not met and the sample means are not normally distributed, parametric tests can lead to erroneous results. Non-parametric tests (distribution-free tests) are used in such situations as they do not require the normality assumption.[ 15 ] Non-parametric tests may fail to detect a significant difference when compared with a parametric test. That is, they usually have less power.

As is done for the parametric tests, the test statistic is compared with known values for the sampling distribution of that statistic and the null hypothesis is accepted or rejected. The types of non-parametric analysis techniques and the corresponding parametric analysis techniques are delineated in Table 5 .

[Table 5: Analogues of parametric and non-parametric tests]

Median test for one sample: The sign test and Wilcoxon's signed rank test

The sign test and Wilcoxon's signed rank test are used for median tests of one sample. These tests examine whether one instance of sample data is greater or smaller than the median reference value.

This test examines the hypothesis about the median θ0 of a population. It tests the null hypothesis H0: θ = θ0. When the observed value (Xi) is greater than the reference value (θ0), it is marked with a + sign. If the observed value is smaller than the reference value, it is marked with a − sign. If the observed value is equal to the reference value (θ0), it is eliminated from the sample.

If the null hypothesis is true, there will be an equal number of + signs and − signs.

The sign test ignores the actual values of the data and only uses + or − signs. Therefore, it is useful when it is difficult to measure the values.

Wilcoxon's signed rank test

A major limitation of the sign test is that we lose the quantitative information in the data and merely use the + or − signs. Wilcoxon's signed rank test not only examines the observed values in comparison with θ0 but also takes into consideration their relative sizes, adding more statistical power to the test. As in the sign test, if an observed value is equal to the reference value θ0, it is eliminated from the sample.

Wilcoxon's rank sum test ranks all data points in order, calculates the rank sum of each sample and compares the difference in the rank sums.

Mann-Whitney test

It is used to test the null hypothesis that two samples have the same median or, alternatively, whether observations in one sample tend to be larger than observations in the other.

Mann–Whitney test compares all data (xi) belonging to the X group and all data (yi) belonging to the Y group and calculates the probability of xi being greater than yi: P (xi > yi). The null hypothesis states that P (xi > yi) = P (xi < yi) =1/2 while the alternative hypothesis states that P (xi > yi) ≠1/2.
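For illustration, the sketch below runs the two rank-based tests just described on invented data using SciPy; it is a tooling example rather than part of the original article.

```python
# Wilcoxon's signed rank test (paired data) and the Mann-Whitney U test
# (two independent samples) on invented measurements.
from scipy import stats

before = [140, 152, 138, 145, 150, 147, 141]
after  = [132, 148, 135, 139, 144, 150, 137]
print(stats.wilcoxon(before, after))

x = [5.1, 5.3, 4.8, 5.0, 5.2, 4.9]
y = [5.6, 5.8, 5.5, 5.9, 5.7, 5.4]
print(stats.mannwhitneyu(x, y))
```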

Kolmogorov-Smirnov test

The two-sample Kolmogorov-Smirnov (KS) test was designed as a generic method to test whether two random samples are drawn from the same distribution. The null hypothesis of the KS test is that both distributions are identical. The statistic of the KS test is a distance between the two empirical distributions, computed as the maximum absolute difference between their cumulative curves.

Kruskal-Wallis test

The Kruskal–Wallis test is a non-parametric test to analyse the variance.[ 14 ] It analyses if there is any difference in the median values of three or more independent samples. The data values are ranked in an increasing order, and the rank sums calculated followed by calculation of the test statistic.

Jonckheere test

In contrast to the Kruskal–Wallis test, the Jonckheere test assumes an a priori ordering of the groups, which gives it more statistical power than the Kruskal–Wallis test.[ 14 ]

Friedman test

The Friedman test is a non-parametric test for testing the difference between several related samples. It is an alternative to repeated measures ANOVA, used when the same parameter has been measured under different conditions on the same subjects.[ 13 ]
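A brief sketch of the Kruskal-Wallis and Friedman tests with SciPy, on invented data, assuming independent groups for the first test and repeated measurements on the same subjects for the second.

```python
# Kruskal-Wallis for three independent samples; Friedman for the same
# subjects measured under three conditions (all values invented).
from scipy import stats

g1 = [23, 25, 21, 22, 24]
g2 = [30, 28, 31, 29, 27]
g3 = [26, 24, 27, 25, 28]
print(stats.kruskal(g1, g2, g3))

cond1 = [7, 6, 8, 5, 9]
cond2 = [5, 5, 6, 4, 7]
cond3 = [8, 7, 9, 6, 9]
print(stats.friedmanchisquare(cond1, cond2, cond3))
```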

Tests to analyse the categorical data

The Chi-square test, Fisher's exact test and McNemar's test are used to analyse categorical or nominal variables. The Chi-square test compares the frequencies and tests whether the observed data differ significantly from the expected data if there were no differences between groups (i.e., under the null hypothesis). It is calculated as the sum of the squared difference between the observed ( O ) and the expected ( E ) data (or the deviation, d ) divided by the expected data, as in the following formula:

χ² = Σ (O − E)² / E

A Yates correction factor is used when the sample size is small. Fisher's exact test is used to determine if there are non-random associations between two categorical variables. It does not assume random sampling, and instead of referring a calculated statistic to a sampling distribution, it calculates an exact probability.

McNemar's test is used for paired nominal data. It is applied to a 2 × 2 table with paired-dependent samples. It is used to determine whether the row and column frequencies are equal (that is, whether there is ‘marginal homogeneity’). The null hypothesis is that the paired proportions are equal.

The Mantel-Haenszel Chi-square test is a multivariate test as it analyses multiple grouping variables. It stratifies according to the nominated confounding variables and identifies any that affect the primary outcome variable. If the outcome variable is dichotomous, then logistic regression is used.
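The categorical tests above are available in SciPy and statsmodels; the sketch below applies them to one invented 2 × 2 table of counts and is intended only to show the calls, not to reproduce the article's data.

```python
# Chi-square, Fisher's exact and McNemar's tests on an invented 2 x 2 table.
import numpy as np
from scipy import stats
from statsmodels.stats.contingency_tables import mcnemar

table = np.array([[20, 15],
                  [10, 25]])

chi2, p, dof, expected = stats.chi2_contingency(table)  # Yates correction applied for 2 x 2 tables
print(f"chi-square = {chi2:.2f}, P = {p:.4f}")

odds_ratio, p_exact = stats.fisher_exact(table)
print(f"Fisher's exact P = {p_exact:.4f}")

print(mcnemar(table, exact=True))   # McNemar's test for paired nominal data
```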

SOFTWARE AVAILABLE FOR STATISTICS, SAMPLE SIZE CALCULATION AND POWER ANALYSIS

Numerous statistical software systems are available currently. The commonly used software systems are Statistical Package for the Social Sciences (SPSS – manufactured by IBM Corporation), Statistical Analysis System (SAS – developed by SAS Institute, North Carolina, United States of America), R (designed by Ross Ihaka and Robert Gentleman from the R Core Team), Minitab (developed by Minitab Inc.), Stata (developed by StataCorp) and MS Excel (developed by Microsoft).

There are a number of web resources related to statistical power analyses; a few are listed below, followed by a short scripted example:

  • StatPages.net – provides links to a number of online power calculators
  • G-Power – provides a downloadable power analysis program that runs under DOS
  • Power analysis for ANOVA designs – an interactive site that calculates the power or sample size needed to attain a given power for one effect in a factorial ANOVA design
  • SPSS makes a program called SamplePower. It gives an output of a complete report on the computer screen which can be cut and pasted into another document.
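The scripted example referred to above is sketched here with statsmodels, which also offers power and sample-size calculations; the effect size, alpha and power values are illustrative only.

```python
# Sample size per group for an independent-samples t-test.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required sample size per group: {n_per_group:.1f}")   # roughly 64
```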

It is important that a researcher knows the concepts of the basic statistical methods used for conduct of a research study. This will help to conduct an appropriately well-designed study leading to valid and reliable results. Inappropriate use of statistical techniques may lead to faulty conclusions, inducing errors and undermining the significance of the article. Bad statistics may lead to bad research, and bad research may lead to unethical practice. Hence, an adequate knowledge of statistics and the appropriate use of statistical tests are important. An appropriate knowledge about the basic statistical methods will go a long way in improving the research designs and producing quality medical research which can be utilised for formulating the evidence-based guidelines.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.

The 7 Most Useful Data Analysis Methods and Techniques

Data analytics is the process of analyzing raw data to draw out meaningful insights. These insights are then used to determine the best course of action.

When is the best time to roll out that marketing campaign? Is the current team structure as effective as it could be? Which customer segments are most likely to purchase your new product?

Ultimately, data analytics is a crucial driver of any successful business strategy. But how do data analysts actually turn raw data into something useful? There are a range of methods and techniques that data analysts use depending on the type of data in question and the kinds of insights they want to uncover.

You can get a hands-on introduction to data analytics in this free short course .

In this post, we’ll explore some of the most useful data analysis techniques. By the end, you’ll have a much clearer idea of how you can transform meaningless data into business intelligence. We’ll cover:

  • What is data analysis and why is it important?
  • What is the difference between qualitative and quantitative data?
  • Regression analysis
  • Monte Carlo simulation
  • Factor analysis
  • Cohort analysis
  • Cluster analysis
  • Time series analysis
  • Sentiment analysis
  • The data analysis process
  • The best tools for data analysis
  •  Key takeaways

The first six methods listed are used for quantitative data , while the last technique applies to qualitative data. We briefly explain the difference between quantitative and qualitative data in section two, but if you want to skip straight to a particular analysis technique, just use the clickable menu.

1. What is data analysis and why is it important?

Data analysis is, put simply, the process of discovering useful information by evaluating data. This is done through a process of inspecting, cleaning, transforming, and modeling data using analytical and statistical tools, which we will explore in detail further along in this article.

Why is data analysis important? Analyzing data effectively helps organizations make business decisions. Nowadays, data is collected by businesses constantly: through surveys, online tracking, online marketing analytics, collected subscription and registration data (think newsletters), social media monitoring, among other methods.

These data will appear as different structures, including—but not limited to—the following:

Big data

The concept of big data —data that is so large, fast, or complex that it is difficult or impossible to process using traditional methods—gained momentum in the early 2000s. Then, Doug Laney, an industry analyst, articulated what is now known as the mainstream definition of big data as the three Vs: volume, velocity, and variety. 

  • Volume: As mentioned earlier, organizations are collecting data constantly. In the not-too-distant past it would have been a real issue to store, but nowadays storage is cheap and takes up little space.
  • Velocity: Received data needs to be handled in a timely manner. With the growth of the Internet of Things, this can mean these data are coming in constantly, and at an unprecedented speed.
  • Variety: The data being collected and stored by organizations comes in many forms, ranging from structured data—that is, more traditional, numerical data—to unstructured data—think emails, videos, audio, and so on. We’ll cover structured and unstructured data a little further on.

Metadata

This is a form of data that provides information about other data, such as an image. In everyday life you’ll find this by, for example, right-clicking on a file in a folder and selecting “Get Info”, which will show you information such as file size and kind, date of creation, and so on.

Real-time data

This is data that is presented as soon as it is acquired. A good example of this is a stock market ticker, which provides information on the most-active stocks in real time.

Machine data

This is data that is produced wholly by machines, without human instruction. An example of this could be call logs automatically generated by your smartphone.

Quantitative and qualitative data

Quantitative data—otherwise known as structured data— may appear as a “traditional” database—that is, with rows and columns. Qualitative data—otherwise known as unstructured data—are the other types of data that don’t fit into rows and columns, which can include text, images, videos and more. We’ll discuss this further in the next section.

2. What is the difference between quantitative and qualitative data?

How you analyze your data depends on the type of data you’re dealing with— quantitative or qualitative . So what’s the difference?

Quantitative data is anything measurable , comprising specific quantities and numbers. Some examples of quantitative data include sales figures, email click-through rates, number of website visitors, and percentage revenue increase. Quantitative data analysis techniques focus on the statistical, mathematical, or numerical analysis of (usually large) datasets. This includes the manipulation of statistical data using computational techniques and algorithms. Quantitative analysis techniques are often used to explain certain phenomena or to make predictions.

Qualitative data cannot be measured objectively , and is therefore open to more subjective interpretation. Some examples of qualitative data include comments left in response to a survey question, things people have said during interviews, tweets and other social media posts, and the text included in product reviews. With qualitative data analysis, the focus is on making sense of unstructured data (such as written text, or transcripts of spoken conversations). Often, qualitative analysis will organize the data into themes—a process which, fortunately, can be automated.

Data analysts work with both quantitative and qualitative data , so it’s important to be familiar with a variety of analysis methods. Let’s take a look at some of the most useful techniques now.

3. Data analysis techniques

Now we’re familiar with some of the different types of data, let’s focus on the topic at hand: different methods for analyzing data. 

a. Regression analysis

Regression analysis is used to estimate the relationship between a set of variables. When conducting any type of regression analysis , you’re looking to see if there’s a correlation between a dependent variable (that’s the variable or outcome you want to measure or predict) and any number of independent variables (factors which may have an impact on the dependent variable). The aim of regression analysis is to estimate how one or more variables might impact the dependent variable, in order to identify trends and patterns. This is especially useful for making predictions and forecasting future trends.

Let’s imagine you work for an ecommerce company and you want to examine the relationship between: (a) how much money is spent on social media marketing, and (b) sales revenue. In this case, sales revenue is your dependent variable—it’s the factor you’re most interested in predicting and boosting. Social media spend is your independent variable; you want to determine whether or not it has an impact on sales and, ultimately, whether it’s worth increasing, decreasing, or keeping the same.

Using regression analysis, you’d be able to see if there’s a relationship between the two variables. A positive correlation would imply that the more you spend on social media marketing, the more sales revenue you make. No correlation at all might suggest that social media marketing has no bearing on your sales. Understanding the relationship between these two variables would help you to make informed decisions about the social media budget going forward.

However, it’s important to note that, on their own, regressions can only be used to determine whether or not there is a relationship between a set of variables—they don’t tell you anything about cause and effect. So, while a positive correlation between social media spend and sales revenue may suggest that one impacts the other, it’s impossible to draw definitive conclusions based on this analysis alone.

There are many different types of regression analysis, and the model you use depends on the type of data you have for the dependent variable. For example, your dependent variable might be continuous (i.e. something that can be measured on a continuous scale, such as sales revenue in USD), in which case you’d use a different type of regression analysis than if your dependent variable was categorical in nature (i.e. comprising values that can be categorized into a number of distinct groups based on a certain characteristic, such as customer location by continent). You can learn more about different types of dependent variables and how to choose the right regression analysis in this guide.
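If you’d like to see what this looks like in practice, here’s a minimal sketch of a simple linear regression in Python using scikit-learn. The spend and revenue figures are invented purely for illustration, and in a real project you’d check the model’s assumptions before relying on it.

```python
# A minimal sketch of simple linear regression with scikit-learn.
# The spend/revenue figures below are made up for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Independent variable: monthly social media spend (USD)
spend = np.array([[1000], [2000], [3000], [4000], [5000]])
# Dependent variable: monthly sales revenue (USD)
revenue = np.array([22000, 29000, 35000, 41000, 48000])

model = LinearRegression().fit(spend, revenue)

print("Slope (revenue change per extra $1 of spend):", model.coef_[0])
print("Intercept:", model.intercept_)
print("R^2:", model.score(spend, revenue))

# Predict revenue for a hypothetical $6,000 spend
print("Predicted revenue at $6,000 spend:", model.predict([[6000]])[0])
```

Keep in mind that a strong fit here still only shows correlation, not causation.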

Regression analysis in action: Investigating the relationship between clothing brand Benetton’s advertising expenditure and sales

b. Monte Carlo simulation

When making decisions or taking certain actions, there are a range of different possible outcomes. If you take the bus, you might get stuck in traffic. If you walk, you might get caught in the rain or bump into your chatty neighbor, potentially delaying your journey. In everyday life, we tend to briefly weigh up the pros and cons before deciding which action to take; however, when the stakes are high, it’s essential to calculate, as thoroughly and accurately as possible, all the potential risks and rewards.

Monte Carlo simulation, otherwise known as the Monte Carlo method, is a computerized technique used to generate models of possible outcomes and their probability distributions. It essentially considers a range of possible outcomes and then calculates how likely it is that each particular outcome will be realized. The Monte Carlo method is used by data analysts to conduct advanced risk analysis, allowing them to better forecast what might happen in the future and make decisions accordingly.

So how does Monte Carlo simulation work, and what can it tell us? To run a Monte Carlo simulation, you’ll start with a mathematical model of your data—such as a spreadsheet. Within your spreadsheet, you’ll have one or several outputs that you’re interested in; profit, for example, or number of sales. You’ll also have a number of inputs; these are variables that may impact your output variable. If you’re looking at profit, relevant inputs might include the number of sales, total marketing spend, and employee salaries.

If you knew the exact, definitive values of all your input variables, you’d quite easily be able to calculate what profit you’d be left with at the end. However, when these values are uncertain, a Monte Carlo simulation enables you to calculate all the possible options and their probabilities. What will your profit be if you make 100,000 sales and hire five new employees on a salary of $50,000 each? What is the likelihood of this outcome? What will your profit be if you only make 12,000 sales and hire five new employees? And so on.

It does this by replacing all uncertain values with functions which generate random samples from distributions determined by you, and then running a series of calculations and recalculations to produce models of all the possible outcomes and their probability distributions. The Monte Carlo method is one of the most popular techniques for calculating the effect of unpredictable variables on a specific output variable, making it ideal for risk analysis.
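To make the mechanics concrete, here’s a small sketch in Python using NumPy. The distributions chosen for sales, price, and costs are assumptions made up for the example; with real data you’d base them on historical figures or expert estimates.

```python
# A minimal Monte Carlo sketch: estimate the distribution of profit
# when the inputs are uncertain. All distributions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(seed=42)
n_simulations = 100_000

# Uncertain inputs, modeled as random samples
units_sold = rng.normal(loc=50_000, scale=8_000, size=n_simulations)
price_per_unit = rng.uniform(low=9.0, high=11.0, size=n_simulations)
variable_cost = rng.normal(loc=6.0, scale=0.5, size=n_simulations)
fixed_costs = 150_000  # assumed known

profit = units_sold * (price_per_unit - variable_cost) - fixed_costs

print("Mean profit:      ", round(profit.mean()))
print("5th percentile:   ", round(np.percentile(profit, 5)))
print("95th percentile:  ", round(np.percentile(profit, 95)))
print("Chance of a loss: ", round((profit < 0).mean() * 100, 1), "%")
```

In a real analysis you’d typically examine the full distribution of outcomes, not just a few summary numbers.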

Monte Carlo simulation in action: A case study using Monte Carlo simulation for risk analysis

c. Factor analysis

Factor analysis is a technique used to reduce a large number of variables to a smaller number of factors. It works on the basis that multiple separate, observable variables correlate with each other because they are all associated with an underlying construct. This is useful not only because it condenses large datasets into smaller, more manageable samples, but also because it helps to uncover hidden patterns. This allows you to explore concepts that cannot be easily measured or observed—such as wealth, happiness, fitness, or, for a more business-relevant example, customer loyalty and satisfaction.

Let’s imagine you want to get to know your customers better, so you send out a rather long survey comprising one hundred questions. Some of the questions relate to how they feel about your company and product; for example, “Would you recommend us to a friend?” and “How would you rate the overall customer experience?” Other questions ask things like “What is your yearly household income?” and “How much are you willing to spend on skincare each month?”

Once your survey has been sent out and completed by lots of customers, you end up with a large dataset that essentially tells you one hundred different things about each customer (assuming each customer gives one hundred responses). Instead of looking at each of these responses (or variables) individually, you can use factor analysis to group them into factors that belong together—in other words, to relate them to a single underlying construct. In this example, factor analysis works by finding survey items that are strongly correlated. This is known as covariance. So, if there’s a strong positive correlation between household income and how much they’re willing to spend on skincare each month (i.e. as one increases, so does the other), these items may be grouped together. Together with other variables (survey responses), you may find that they can be reduced to a single factor such as “consumer purchasing power”. Likewise, if a customer experience rating of 10/10 correlates strongly with “yes” responses regarding how likely they are to recommend your product to a friend, these items may be reduced to a single factor such as “customer satisfaction”.

In the end, you have a smaller number of factors rather than hundreds of individual variables. These factors are then taken forward for further analysis, allowing you to learn more about your customers (or any other area you’re interested in exploring).
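If you’d like a feel for how this looks in code, here’s a small sketch using scikit-learn’s FactorAnalysis on fabricated survey responses. In a real project you’d work with your actual survey data and inspect the factor loadings carefully before deciding how many factors to keep.

```python
# A minimal factor analysis sketch with scikit-learn.
# The survey responses below are fabricated for illustration.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
purchasing_power = rng.normal(size=200)   # hidden construct 1
satisfaction = rng.normal(size=200)       # hidden construct 2

# Rows = respondents, columns = survey items
# (income, skincare budget, recommend score, experience rating)
responses = np.column_stack([
    purchasing_power + rng.normal(scale=0.3, size=200),  # income
    purchasing_power + rng.normal(scale=0.3, size=200),  # skincare budget
    satisfaction + rng.normal(scale=0.3, size=200),      # recommend score
    satisfaction + rng.normal(scale=0.3, size=200),      # experience rating
])

fa = FactorAnalysis(n_components=2, random_state=0).fit(responses)

# Loadings show how strongly each survey item relates to each factor
print(np.round(fa.components_, 2))
```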

Factor analysis in action: Using factor analysis to explore customer behavior patterns in Tehran

d. Cohort analysis

Cohort analysis is a data analytics technique that groups users based on a shared characteristic, such as the date they signed up for a service or the product they purchased. Once users are grouped into cohorts, analysts can track their behavior over time to identify trends and patterns.

So what does this mean and why is it useful? Let’s break down the above definition further. A cohort is a group of people who share a common characteristic (or action) during a given time period. Students who enrolled at university in 2020 may be referred to as the 2020 cohort. Customers who purchased something from your online store via the app in the month of December may also be considered a cohort.

With cohort analysis, you’re dividing your customers or users into groups and looking at how these groups behave over time. So, rather than looking at a single, isolated snapshot of all your customers at a given moment in time (with each customer at a different point in their journey), you’re examining your customers’ behavior in the context of the customer lifecycle. As a result, you can start to identify patterns of behavior at various points in the customer journey—say, from their first ever visit to your website, through to email newsletter sign-up, to their first purchase, and so on. As such, cohort analysis is dynamic, allowing you to uncover valuable insights about the customer lifecycle.

This is useful because it allows companies to tailor their service to specific customer segments (or cohorts). Let’s imagine you run a 50% discount campaign in order to attract potential new customers to your website. Once you’ve attracted a group of new customers (a cohort), you’ll want to track whether they actually buy anything and, if they do, whether or not (and how frequently) they make a repeat purchase. With these insights, you’ll start to gain a much better understanding of when this particular cohort might benefit from another discount offer or retargeting ads on social media, for example. Ultimately, cohort analysis allows companies to optimize their service offerings (and marketing) to provide a more targeted, personalized experience. You can learn more about how to run cohort analysis using Google Analytics.
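Here’s a rough sketch of a simple cohort table built with pandas. The order data is invented for illustration; with real data you’d load your transaction history instead.

```python
# A minimal cohort analysis sketch with pandas.
# The orders DataFrame is fabricated; in practice you'd load real transaction data.
import pandas as pd

orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3, 3, 4],
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-02-10", "2024-01-20", "2024-03-02",
        "2024-02-14", "2024-03-18", "2024-03-25",
    ]),
})

# Each customer's cohort is the month of their first order
orders["order_month"] = orders["order_date"].dt.to_period("M")
orders["cohort"] = orders.groupby("customer_id")["order_month"].transform("min")

# Count active customers per cohort per month
cohort_counts = (
    orders.groupby(["cohort", "order_month"])["customer_id"]
    .nunique()
    .unstack(fill_value=0)
)
print(cohort_counts)
```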

Cohort analysis in action: How Ticketmaster used cohort analysis to boost revenue

e. Cluster analysis

Cluster analysis is an exploratory technique that seeks to identify structures within a dataset. The goal of cluster analysis is to sort different data points into groups (or clusters) that are internally homogeneous and externally heterogeneous. This means that data points within a cluster are similar to each other, and dissimilar to data points in another cluster. Clustering is used to gain insight into how data is distributed in a given dataset, or as a preprocessing step for other algorithms.

There are many real-world applications of cluster analysis. In marketing, cluster analysis is commonly used to group a large customer base into distinct segments, allowing for a more targeted approach to advertising and communication. Insurance firms might use cluster analysis to investigate why certain locations are associated with a high number of insurance claims. Another common application is in geology, where experts will use cluster analysis to evaluate which cities are at greatest risk of earthquakes (and thus try to mitigate the risk with protective measures).

It’s important to note that, while cluster analysis may reveal structures within your data, it won’t explain why those structures exist. With that in mind, cluster analysis is a useful starting point for understanding your data and informing further analysis. Clustering algorithms are also used in machine learning—you can learn more about clustering in machine learning in our guide.
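To illustrate, here’s a minimal k-means clustering sketch using scikit-learn. The customer figures and the choice of three clusters are assumptions made purely for the example.

```python
# A minimal k-means clustering sketch with scikit-learn.
# The customer data and number of clusters are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Columns: annual spend (USD), number of orders per year
customers = np.array([
    [200, 2], [250, 3], [2200, 25], [2400, 22],
    [900, 10], [950, 12], [2100, 20], [300, 4],
])

# Scale features so spend doesn't dominate the distance calculation
scaled = StandardScaler().fit_transform(customers)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(scaled)
print("Cluster label for each customer:", kmeans.labels_)
```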

Cluster analysis in action: Using cluster analysis for customer segmentation—a telecoms case study example

f. Time series analysis

Time series analysis is a statistical technique used to identify trends and cycles over time. Time series data is a sequence of data points which measure the same variable at different points in time (for example, weekly sales figures or monthly email sign-ups). By looking at time-related trends, analysts are able to forecast how the variable of interest may fluctuate in the future.

When conducting time series analysis, the main patterns you’ll be looking out for in your data are:

  • Trends: Stable, linear increases or decreases over an extended time period.
  • Seasonality: Predictable fluctuations in the data due to seasonal factors over a short period of time. For example, you might see a peak in swimwear sales in summer around the same time every year.
  • Cyclic patterns: Unpredictable cycles where the data fluctuates. Cyclical trends are not due to seasonality, but rather, may occur as a result of economic or industry-related conditions.

As you can imagine, the ability to make informed predictions about the future has immense value for business. Time series analysis and forecasting is used across a variety of industries, most commonly for stock market analysis, economic forecasting, and sales forecasting. There are different types of time series models depending on the data you’re using and the outcomes you want to predict. These models are typically classified into three broad types: the autoregressive (AR) models, the integrated (I) models, and the moving average (MA) models. For an in-depth look at time series analysis, refer to our guide.
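As a quick illustration, the sketch below decomposes a small monthly sales series into trend and seasonal components using statsmodels. The figures are invented, and the additive model is an assumption you’d want to check against your own data.

```python
# A minimal time series decomposition sketch with pandas and statsmodels.
# The monthly sales figures are fabricated for illustration.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

sales = pd.Series(
    [110, 98, 105, 120, 140, 160, 175, 170, 150, 130, 115, 125] * 3,
    index=pd.date_range("2021-01-01", periods=36, freq="MS"),
)

# Split the series into trend, seasonal, and residual components
result = seasonal_decompose(sales, model="additive", period=12)
print(result.trend.dropna().head())
print(result.seasonal.head(12))
```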

Time series analysis in action: Developing a time series model to predict jute yarn demand in Bangladesh

g. Sentiment analysis

When you think of data, your mind probably automatically goes to numbers and spreadsheets.

Many companies overlook the value of qualitative data, but in reality, there are untold insights to be gained from what people (especially customers) write and say about you. So how do you go about analyzing textual data?

One highly useful qualitative technique is sentiment analysis, which belongs to the broader category of text analysis—the (usually automated) process of sorting and understanding textual data.

With sentiment analysis, the goal is to interpret and classify the emotions conveyed within textual data. From a business perspective, this allows you to ascertain how your customers feel about various aspects of your brand, product, or service.

There are several different types of sentiment analysis models, each with a slightly different focus. The three main types include:

Fine-grained sentiment analysis

If you want to focus on opinion polarity (i.e. positive, neutral, or negative) in depth, fine-grained sentiment analysis will allow you to do so.

For example, if you wanted to interpret star ratings given by customers, you might use fine-grained sentiment analysis to categorize the various ratings along a scale ranging from very positive to very negative.

Emotion detection

This model often uses complex machine learning algorithms to pick out various emotions from your textual data.

You might use an emotion detection model to identify words associated with happiness, anger, frustration, and excitement, giving you insight into how your customers feel when writing about you or your product on, say, a product review site.

Aspect-based sentiment analysis

This type of analysis allows you to identify what specific aspects the emotions or opinions relate to, such as a certain product feature or a new ad campaign.

If a customer writes that they “find the new Instagram advert so annoying”, your model should detect not only a negative sentiment, but also the object towards which it’s directed.

In a nutshell, sentiment analysis uses various Natural Language Processing (NLP) algorithms and systems which are trained to associate certain inputs (for example, certain words) with certain outputs.

For example, the input “annoying” would be recognized and tagged as “negative”. Sentiment analysis is crucial to understanding how your customers feel about you and your products, for identifying areas for improvement, and even for averting PR disasters in real-time!
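As a toy illustration of that input-to-label idea (real systems rely on trained NLP models rather than a hand-written word list), here’s a tiny lexicon-based sketch in Python. The word lists and reviews are made up.

```python
# A toy lexicon-based sentiment sketch. Real sentiment analysis relies on
# trained NLP models; this only illustrates the input-to-label idea.
POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"annoying", "terrible", "slow", "disappointed"}

def sentiment(text: str) -> str:
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

reviews = [
    "I love the new app, it is excellent",
    "The new Instagram advert is so annoying",
    "Delivery was fine",
]
for review in reviews:
    print(f"{sentiment(review):8} | {review}")
```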

Sentiment analysis in action: 5 Real-world sentiment analysis case studies

4. The data analysis process

In order to gain meaningful insights from data, data analysts will perform a rigorous step-by-step process. We go over this in detail in our step-by-step guide to the data analysis process—but, to briefly summarize, the data analysis process generally consists of the following phases:

Defining the question

The first step for any data analyst will be to define the objective of the analysis, sometimes called a ‘problem statement’. Essentially, you’re asking a question with regard to a business problem you’re trying to solve. Once you’ve defined this, you’ll then need to determine which data sources will help you answer this question.

Collecting the data

Now that you’ve defined your objective, the next step will be to set up a strategy for collecting and aggregating the appropriate data. Will you be using quantitative (numeric) or qualitative (descriptive) data? Will that data come from first-party, second-party, or third-party sources?

Learn more: Quantitative vs. Qualitative Data: What’s the Difference? 

Cleaning the data

Unfortunately, your collected data isn’t automatically ready for analysis—you’ll have to clean it first. As a data analyst, this phase of the process will take up the most time. During the data cleaning process, you will likely be:

  • Removing major errors, duplicates, and outliers
  • Removing unwanted data points
  • Structuring the data—that is, fixing typos, layout issues, etc.
  • Filling in major gaps in data
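In practice, a lot of this cleaning happens in a few lines of pandas. The sketch below shows some typical steps on a hypothetical sales export; the file name, column names, and cleaning rules are all assumptions for illustration.

```python
# A minimal data cleaning sketch with pandas.
# File name, column names, and cleaning rules are illustrative assumptions.
import pandas as pd

df = pd.read_csv("sales_export.csv")

# Remove exact duplicate rows
df = df.drop_duplicates()

# Standardize messy text values (fix casing and stray whitespace)
df["region"] = df["region"].str.strip().str.title()

# Drop rows missing the value we care about, fill minor gaps elsewhere
df = df.dropna(subset=["revenue"])
df["discount"] = df["discount"].fillna(0)

# Remove extreme outliers (here: anything beyond the 99th percentile)
df = df[df["revenue"] <= df["revenue"].quantile(0.99)]
```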

Analyzing the data

Now that we’ve finished cleaning the data, it’s time to analyze it! Many analysis methods have already been described in this article, and it’s up to you to decide which one will best suit the assigned objective. It may fall under one of the following categories:

  • Descriptive analysis, which identifies what has already happened
  • Diagnostic analysis, which focuses on understanding why something has happened
  • Predictive analysis, which identifies future trends based on historical data
  • Prescriptive analysis, which allows you to make recommendations for the future

Visualizing and sharing your findings

We’re almost at the end of the road! Analyses have been made, insights have been gleaned—all that remains to be done is to share this information with others. This is usually done with a data visualization tool, such as Google Charts or Tableau.

Learn more: 13 of the Most Common Types of Data Visualization


5. The best tools for data analysis

As you can imagine, every phase of the data analysis process requires the data analyst to have a variety of tools under their belt that assist in gaining valuable insights from data. We cover these tools in greater detail in this article, but, in summary, here’s our best-of-the-best list, with links to each product:

The top 9 tools for data analysts

  • Microsoft Excel
  • Jupyter Notebook
  • Apache Spark
  • Microsoft Power BI

6. Key takeaways and further reading

As you can see, there are many different data analysis techniques at your disposal. In order to turn your raw data into actionable insights, it’s important to consider what kind of data you have (is it qualitative or quantitative?) as well as the kinds of insights that will be useful within the given context. In this post, we’ve introduced seven of the most useful data analysis techniques—but there are many more out there to be discovered!

So what now? If you haven’t already, we recommend reading the case studies for each analysis technique discussed in this post (you’ll find a link at the end of each section). For a more hands-on introduction to the kinds of methods and techniques that data analysts use, try out this free introductory data analytics short course. In the meantime, you might also want to read the following:

  • The Best Online Data Analytics Courses for 2024
  • What Is Time Series Data and How Is It Analyzed?
  • What is Spatial Analysis?

Exploring Top 15 Data Analysis Tools to Elevate Your Insights


Data is everywhere, and understanding it can be a superpower. Imagine having a friend who helps you make sense of all the information around you. Well, that’s what data analysis tools do!

These tools act as your friendly assistants, making the complex world of data understandable and actionable. Whether you’re a seasoned data scientist crunching numbers for insights, a business analyst making strategic decisions, or someone new to the data game, these top 15 data analysis tools bring diverse features to the table.

From creating visual stories to unraveling patterns in data, these tools empower you to gain valuable insights. Picture them as your digital sidekicks, simplifying data and turning it into actionable intelligence. 

So, whether you’re thinking of boosting business strategies or just curious about the stories your data can tell, these tools are here to guide you through the fascinating journey of data exploration. Let’s dive into the details and discover how each tool can enhance your analytical superpowers!

What is Data Analysis?

Data analysis is the process of inspecting, cleaning, transforming, and modeling data to discover useful information, draw conclusions, and support decision-making. It involves a variety of techniques and methods to uncover patterns, trends, correlations, and insights from raw data.

It is widely used in various fields, including business, finance, healthcare, science, and social sciences, to make informed decisions and drive improvements based on evidence and insights derived from data. It plays a crucial role in extracting valuable knowledge from the vast amounts of data generated in today’s digital age.

What are Data Analysis Tools?

Data analysis tools refer to software and applications designed to collect, clean, process, analyze, and visualize data. These tools help individuals and organizations make informed decisions by extracting meaningful insights from raw data. Data analysis tools can vary widely in their features, capabilities, and complexity.

The choice of a data analysis tool depends on factors such as the nature of the data, the complexity of the analysis, user expertise, and specific requirements. Analysts and data scientists often use a combination of tools to address different aspects of the data analysis workflow.

Why are Data Analysis Tools Important for Your Business?

Data analysis tools are essential for your business for several reasons, as they play a pivotal role in extracting valuable insights from your data. Here are some key reasons why data analysis tools are important for your business:

Informed Decision-Making

Data analysis tools serve as your compass in decision-making. By thoroughly examining historical data and current data, these tools provide a solid foundation for making choices that are rooted in evidence and data insights. This ensures that your decisions are well-informed, reducing reliance on intuition and increasing the likelihood of successful outcomes.

Competitive Advantage

Data analysis tools act as your strategic companion, uncovering market trends, deciphering customer preferences, and identifying industry benchmarks. This wealth of information enables your business to adapt proactively, capitalize on emerging opportunities, and maintain a competitive advantage over others in the market.

Efficient Operations

Data analysis tools are like efficiency boosters for business operations. By delving into internal data, they help pinpoint areas of inefficiency, streamline workflows, and optimize resource allocation. The result is a finely tuned operational machine that maximizes output while minimizing unnecessary costs and efforts.

Customer Insights

Understanding your customers is at the heart of successful business strategies. Data analysis tools offer a magnifying glass into customer behavior, preferences, and feedback. Armed with these insights, you can tailor your marketing strategies, personalize customer experiences, and ultimately enhance overall satisfaction. This deeper connection with your customer base can build loyalty and drive business growth.

Risk Management

Navigating the business landscape involves dealing with uncertainties and risks. Data analysis tools function as risk detectors, surfacing patterns and anomalies in your data that point to potential problems before they escalate. By proactively managing risks, your business is better positioned to weather challenges, seize opportunities, and maintain a resilient and adaptive stance in the market.

Types of Data Analytics Tools

A data analytics tool comes in various forms, each designed to serve specific needs within the data analysis process. Here are some common types of data analytics tools:

  • Statistical Analysis Tools: Conducting statistical analyses, hypothesis testing, and regression analysis to extract insights from data.
  • Data Visualization Tools: Creating visual representations of data through charts, graphs, and dashboards for easier interpretation.
  • Programming Languages: Writing custom code for data analysis, manipulation, and visualization. Libraries like Pandas and Matplotlib enhance functionality.
  • Database Management Systems (DBMS): Storing, managing, and retrieving structured data efficiently for analysis.
  • Business Intelligence (BI) Tools: Translating raw data into actionable insights through interactive dashboards and reports for strategic decision-making.
  • Text Analytics Tools: Extracting insights and patterns from textual data through techniques like sentiment analysis and language processing.
  • Big Data Tools: Processing and analyzing large volumes of structured and unstructured data in a distributed computing environment.
  • Data Wrangling Tools: Cleaning, transforming, and preparing raw data for analysis.

These tools cater to different stages of the data analysis process and offer diverse functionalities. Depending on the specific requirements of a data analysis task, analysts may choose a combination of these tools to achieve their objectives efficiently.

What are The Factors to Consider When Choosing a Data Analysis Tool?

Choosing data analysis software requires careful consideration of your specific needs, the nature of your data, and your team’s skills. Here’s a step-by-step guide to help you make an informed decision:

Define Your Objectives

  • Clearly outline your goals and objectives for data analysis.
  • Identify the specific tasks and analyses you need to perform.

Understand Your Data

  • Consider the size, complexity, and format of your data.
  • Evaluate the types of data sources you’ll be working with (structured, unstructured, semi-structured).

Consider Your Technical Environment

  • Assess the compatibility of the tool with your existing systems and technologies.
  • Check if the tool supports your organization’s programming languages and frameworks.

Ease of Use

  • Evaluate the user-friendliness of the tool, especially if your team includes non-technical users.
  • Look for tools with intuitive interfaces and good documentation.

Scalability

  • Consider the scalability of the tool to handle growing datasets and increasing analysis complexity.
  • Check if the tool supports parallel processing and distributed computing.

Supported Analysis Techniques

  • Ensure the tool supports the statistical and machine learning techniques relevant to your analysis requirements.
  • Check for the availability of libraries and packages for advanced analytics.

Integration Capabilities

  • Assess how well the tool integrates with other tools and platforms in your data ecosystem.
  • Consider the ability to connect to different data sources.

Cost and Licensing

  • Evaluate the cost structure, including licensing fees, maintenance, and support costs.
  • Consider open-source options if budget constraints are a concern.

Community and Support

  • Check the user community and support resources for the tool.
  • Look for active forums, documentation, and the availability of training materials.

Top 15 Data Analysis Tools to Elevate Your Insights

Whether you’re an experienced data scientist or a business professional interested in unlocking the potential of data, the tools listed below shine as outstanding options to enhance your analytical pursuits. Let’s explore:

1. QuestionPro

QuestionPro is a versatile platform known for its survey and research capabilities. While traditionally recognized for its survey functionalities, it has expanded to offer basic data analysis tools. It also provides a user-friendly interface for users to analyze and visualize survey data.

How it Works:

QuestionPro simplifies the data analysis process by allowing users to create surveys, collect responses, and analyze the gathered data. The platform provides basic tools for generating reports and visualizations based on survey responses.

  • Survey customization options.
  • Real-time reporting features.
  • Integration capabilities.
  • Export data in various formats.
  • Rich visualizations.
  • Limited advanced analytics features.

QuestionPro provides a variety of pricing plans tailored to suit businesses of different sizes, starting at $99 per user per month. The platform also offers custom pricing options for enterprise-level solutions. To allow users to explore its features before committing, QuestionPro offers a free trial.

2. Tableau

Tableau is a powerhouse in data visualization and business intelligence. Renowned for its ability to turn complex datasets into interactive visualizations, it is a go-to tool for professionals seeking to make data-driven decisions.

Tableau connects to various data sources, allowing users to create interactive visualizations and dashboards. Its drag-and-drop interface makes it accessible, while its extensive range of visualization options caters to diverse analytical needs.

  • Excellent for data exploration and presentation.
  • Strong community and support.
  • Integrates with various data sources.
  • Cost may be a barrier for smaller organizations.

Tableau offers a monthly pricing plan at $75, providing users with access to its powerful data visualization and business intelligence tools.

3. Google Data Studio

Google Data Studio is a free and intuitive tool for creating interactive dashboards and reports. Developed by Google, it seamlessly integrates with other Google products and external data sources.

Google Data Studio enables users to connect to various data sources, design customizable reports and dashboards, and share insights with team members. Its drag-and-drop interface makes it easy for users to create visually appealing data presentations.

  • Free to use with basic features.
  • Seamless integration with Google products.
  • User-friendly drag-and-drop interface.
  • Easy sharing options.

4. Microsoft Power BI

Microsoft Power BI is a comprehensive business analytics tool that empowers users to visualize and share insights across organizations or embed them in applications and websites.

Power BI connects to various data sources, including Microsoft services. Users can create interactive reports and dashboards, share insights, and leverage AI-powered analytics for advanced data exploration.

  • Seamless integration with Microsoft products.
  • Robust analytics capabilities.
  • Scalable for enterprise use.
  • Extensive visualization options.
  • Licensing costs can be high.

Microsoft Power BI offers a customized pricing model. It allows businesses to tailor their investment based on specific needs and requirements.

5. Qlik Sense

Qlik Sense is a data analytics software and business intelligence platform known for its associative data modeling. It offers users flexibility in data exploration and visualization.

Qlik Sense allows users to load data from various sources, associate and visualize data without predefined queries, create interactive dashboards, and share insights for collaborative decision-making.

  • Associative data model for flexible exploration.
  • Powerful data visualization capabilities.
  • Collaborative features for team analysis.
  • Qlik DataMarket for external data integration.
  • Limited customization options for certain visual elements.

Qlik Sense employs a customized pricing model, allowing businesses to structure their investments according to their distinct analytical needs.

6. Zoho Analytics

Zoho Analytics is a cloud-based business intelligence and analytics platform designed to help users create reports and dashboards for informed decision-making.

Users can import data from various sources, build reports and dashboards using a drag-and-drop interface, analyze data with AI-powered insights, and collaborate with team members.

  • Extensive integration options.
  • AI-powered insights.
  • Collaboration and sharing features.
  • It may not be as feature-rich as some premium tools.

Zoho Analytics offers various pricing plans, starting from free for basic usage. Paid plans range from affordable options suitable for small businesses to more extensive plans for larger enterprises, providing flexibility based on organizational needs.

7. SAS

SAS (Statistical Analysis System) is a software suite known for advanced analytics, business intelligence, and data management. It offers powerful statistical analysis capabilities.

SAS allows users to import and manage data from various sources, perform advanced statistical analyses, generate reports and visualizations, and deploy models for predictive analytics.

  • Comprehensive statistical analysis capabilities.
  • Handles large datasets efficiently.
  • Advanced analytics and machine learning features.
  • Strong data security measures.
  • Extensive industry usage.
  • Limited integration options with certain data sources.

SAS pricing is often customized based on specific business needs, so organizations must contact SAS directly for a tailored quote.

8. Google Analytics

Google Analytics is a web analytics service that provides insights into website and app usage. While not a traditional data analysis tool, it is valuable for understanding user behavior.

By implementing tracking code on the website or app, users can collect data on user interactions, analyze user behavior through reports, and make data-driven decisions for website optimization.

  • Free basic version available.
  • Integrates with other Google products.
  • Real-time reporting.
  • Customizable reporting.
  • Limited to web and app analytics.

Google Analytics offers a free version with basic features. Advanced features are available through Google Analytics 360, with pricing based on user requirements.

9. Splunk

Splunk is a powerful platform designed to search, monitor, and analyze machine-generated data. It is valuable for IT operations, security, and business analytics.

Users can ingest machine data from various sources, search and analyze data in real-time, create dashboards for monitoring and visualization, and gain insights into system performance and security.

  • Real-time data analysis and monitoring.
  • Scalable for large-scale data environments.
  • Powerful search and visualization capabilities.
  • App ecosystem for extended functionality.
  • Effective for IT and security analytics.
  • The GUI-based interface may require adaptation for certain users.

Splunk pricing varies based on factors such as data volume and features required. Organizations can contact Splunk for a personalized quote.

10. Looker

Looker is a business intelligence and data exploration platform that allows users to create and share reports and dashboards. It also provides a unified view of data across an organization.

Looker operates on a model-centric approach, where users define a semantic layer (LookML) to abstract data complexities. Users can then create and customize interactive dashboards and reports using a web-based interface.

  • Cohesive data experience.
  • Real-time data exploration and analysis.
  • Powerful collaboration and sharing features.

Looker’s pricing model is flexible, and organizations need to contact Looker directly for a customized quote based on their specific requirements.

11. Python

Python is the adventurous explorer in the world of coding. While not exclusive to data analysis, it has become a popular language for data scientists and data analysts. With its simplicity and versatility, Python opens up a world of possibilities for those who want to take their data analysis skills to the next level.

Users can leverage Python’s extensive libraries to import, clean, analyze, and visualize data. Jupyter Notebooks, an interactive coding environment, enhances collaboration and documentation, making it a popular choice among data analysts and scientists.

  • Open-source and widely used.
  • Extensive libraries for data analysis and machine learning algorithms.
  • High flexibility and customization.
  • Strong community support.
  • Limited native reporting features.
  • It may lack the user-friendly interface of some GUI-based tools.

Python is an open-source language, and its libraries are freely available for use. There are no licensing costs associated with Python itself.
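As a quick taste of that workflow, the short pandas sketch below loads a hypothetical CSV file, drops incomplete rows, and summarizes revenue by channel; the file and column names are assumptions made for the example.

```python
# A short pandas sketch of a typical analysis loop: import, clean, summarize.
# The file name and columns ("channel", "revenue") are illustrative assumptions.
import pandas as pd

df = pd.read_csv("campaign_results.csv")
df = df.dropna(subset=["revenue"])

# Quick summary statistics and a per-channel breakdown
print(df["revenue"].describe())
print(df.groupby("channel")["revenue"].agg(["count", "mean", "sum"]))
```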

12. R

R is a programming language and environment specifically designed for statistical computing and graphics. It is widely used in academia and industry and offers a vast array of statistical and data analysis packages.

R packages allow users to perform statistical analyses, manipulate data, and visualize data. RStudio, a popular integrated development environment (IDE), enhances the coding experience and facilitates the creation of reproducible reports.

  • Robust data visualization options.
  • Strong support for reproducible research.
  • Active and engaged user community.
  • It may not be as versatile as general-purpose languages.
  • Limited scalability for big data.

R is open-source, and R packages are freely available. RStudio, while offering a free version, has a commercial version with additional features.

13. Jupyter Notebook

Jupyter Notebook is an open-source web application that allows users to create and share documents containing live code, equations, visualizations, and narrative text. It supports various programming languages, including Python and R.

Users can create interactive documents containing code cells that can be executed sequentially. Jupyter Notebooks are widely used in data analysis, machine learning, and collaborative research, offering a flexible and accessible environment.

  • Supports multiple programming languages.
  • Interactive and collaborative coding environment.
  • Allows integration of code, visualizations, and narrative.
  • Easily shareable and reproducible.
  • It may not be as feature-rich as specialized tools.

Jupyter Notebook is open-source and freely available. Users can install it locally or use cloud-based platforms.

14. KNIME

KNIME (Konstanz Information Miner) is an open-source data analytics, reporting, and integration platform. It allows users to visually design data workflows, incorporating various data processing and analysis tasks.

Users can drag and drop nodes to design workflows, incorporating data preprocessing, analysis, and visualization tasks. KNIME supports various plugins and integrations with other tools, providing flexibility in data analysis.

  • Visual workflow design for easy understanding.
  • Active community and extensive documentation.
  • Integrates with numerous data sources and formats.
  • Suitable for users with varying technical expertise.
  • Limited scalability for very large datasets.

KNIME Analytics Platform is open-source, and additional commercial extensions are available for advanced functionalities.

15. RapidMiner

RapidMiner is an integrated data science platform that combines data preparation, machine learning, and predictive modeling. It aims to simplify complex data science processes for users with varying skill levels.

Users can design data pipelines by visually connecting pre-built operators. RapidMiner provides machine learning and analytics capabilities, making it suitable for tasks ranging from data preprocessing to predictive modeling.

  • Visual workflow design for simplicity.
  • An extensive set of pre-built operators for common tasks.
  • Machine learning and predictive modeling capabilities.
  • Licensing costs can be high for certain features.

RapidMiner offers a free version with limited functionalities. Commercial licenses are available for additional features and support.

Why Is QuestionPro the Best Choice for Your Business?

While QuestionPro is primarily known as a survey and feedback platform, it does offer some features that can aid in the initial stages of data analysis. Here’s how QuestionPro can help in a data analysis process:

  • Survey Design and Data Collection: QuestionPro allows you to design surveys with various question types, including multiple-choice, open-ended, Likert scales, and more. It facilitates the collection of data from respondents through online surveys, mobile surveys, email surveys, and offline surveys.
  • Data Export: You can export the collected survey data in different formats, such as Excel, CSV, or SPSS. This is essential for further analysis in external tools.
  • Basic Analysis Features: QuestionPro provides basic analysis tools within the platform, such as summary statistics, frequency distribution, and cross-tabulation. Users can generate charts and graphs to visualize survey data directly on the platform.
  • Reporting: The platform offers reporting features that allow users to create and share reports based on survey results. Customizable dashboards may be available for a quick overview of key metrics.
  • Integration: QuestionPro may integrate with other data analysis tools or platforms, enabling users to export data for more in-depth analysis using tools like Excel, SPSS, R, or Python (see the short sketch after this list).
  • Advanced Survey Logic: Advanced survey logic features within QuestionPro allow for dynamic question branching and skip logic, ensuring that respondents are directed to relevant questions based on their previous answers.
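For example, once survey responses are exported to CSV, a quick cross-tabulation in pandas might look like the sketch below. The file name and the “age_group” and “satisfaction” columns are hypothetical.

```python
# A minimal sketch of cross-tabulating exported survey data with pandas.
# "survey_export.csv", "age_group", and "satisfaction" are hypothetical names.
import pandas as pd

responses = pd.read_csv("survey_export.csv")

# Count how satisfaction ratings break down by age group
print(pd.crosstab(responses["age_group"], responses["satisfaction"]))
```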

These top 15 data analysis tools are your companions on the journey to elevate your insights. Whether you’re a novice or an experienced data explorer, there’s a tool suited to your needs. Each tool brings its unique strengths, turning data analysis into an adventure rather than a daunting task.

Additionally, it’s worth mentioning the role of QuestionPro, a comprehensive survey and analytics platform that empowers users to gather valuable insights directly from their target audience. 

Integrating tools like QuestionPro into your data analysis toolkit allows you to explore the power of surveys and feedback to complement quantitative data analysis. It will help you better understand your audience’s preferences, behaviors, and sentiments.

So, grab your data hat, put on your analysis glasses, and let the exploration begin. Your insights are waiting to be uncovered! Happy exploring!
