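Tool Overview

Tool name: Two-Group/Multi-Group Comparison Horizontal Bar Chart

This tool identifies differentially abundant features in microbiome data (such as OTUs, ASVs, species, or functional genes) across two or more groups. It offers several statistical tests, applies multiple-testing correction to the resulting p-values, and produces summary figures.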

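Main Features

Data preprocessing: filtering of low-prevalence features, zero handling, log transformation
Two-group comparison: Mann-Whitney U test or t-test
Multi-group comparison: Kruskal-Wallis test or ANOVA
Multiple-testing correction: FDR correction (Benjamini-Hochberg) or other methods
Effect sizes: Hedges' g (standardized mean difference) and log2FC (fold change)
Visualization: a combined figure showing group means, differences, effect sizes, and fold changes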

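Input Data Format

Abundance table (input_file)

Rows are features (for example, microbial species) and columns are samples. The file must be tab-delimited text.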

# Example abundance table format
Feature   Sample1   Sample2   Sample3   Sample4
SpeciesA   0.15     0.22     0.08     0.17
SpeciesB   0.08     0.05     0.12     0.09
SpeciesC   0.21     0.18     0.25     0.16
...

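Metadata table (map_file)

Contains the group assignment of each sample. The first column holds the sample IDs, which must match the column names of the abundance table.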

# Example metadata table format
#SampleID   Group   OtherMetadata
Sample1     Control  ...
Sample2     Control  ...
Sample3     Experimental  ...
Sample4     Experimental  ...
...
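The two input formats above can be sanity-checked before analysis. The following is a minimal sketch, not the tool's own loader: the function names and the in-memory example strings are hypothetical, and it assumes the group label sits in the second metadata column, as in the example.

```python
import csv
import io

# Hypothetical in-memory stand-ins for input_file and map_file (tab-delimited).
ABUNDANCE_TSV = (
    "Feature\tSample1\tSample2\tSample3\tSample4\n"
    "SpeciesA\t0.15\t0.22\t0.08\t0.17\n"
    "SpeciesB\t0.08\t0.05\t0.12\t0.09\n"
)
METADATA_TSV = (
    "#SampleID\tGroup\n"
    "Sample1\tControl\n"
    "Sample2\tControl\n"
    "Sample3\tExperimental\n"
    "Sample4\tExperimental\n"
)


def read_abundance(handle):
    """Return (sample_ids, {feature: [abundance per sample]})."""
    rows = csv.reader(handle, delimiter="\t")
    header = next(rows)
    samples = header[1:]
    table = {row[0]: [float(v) for v in row[1:]] for row in rows if row}
    return samples, table


def read_groups(handle):
    """Return {sample_id: group} from the first two metadata columns."""
    rows = csv.reader(handle, delimiter="\t")
    next(rows)  # skip the '#SampleID' header line
    return {row[0]: row[1] for row in rows if row}


def check_sample_ids(samples, groups):
    """Return the sample IDs that appear in only one of the two tables."""
    missing_in_map = sorted(set(samples) - set(groups))
    missing_in_table = sorted(set(groups) - set(samples))
    return missing_in_map, missing_in_table


samples, table = read_abundance(io.StringIO(ABUNDANCE_TSV))
groups = read_groups(io.StringIO(METADATA_TSV))
print(check_sample_ids(samples, groups))  # ([], []) when every ID matches
```

Two empty lists mean the tables agree; any ID listed on either side should be fixed in the input files before running the analysis.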

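Analysis Workflow

1. Data preprocessing

The tool first preprocesses the abundance table:

Low-prevalence filtering: by default, features detected in fewer than 20% of samples are removed
Zero handling: zeros can optionally be replaced with half of the minimum non-zero value
Log transformation: an optional log10 transformation reduces skewness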
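The three preprocessing steps can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the function and parameter names are hypothetical, "detected" is taken to mean a value greater than zero, and the minimum non-zero value is taken over the whole filtered table (the tool may compute it per feature instead).

```python
import math


def preprocess(table, min_prevalence=0.2, fill_zeros=True, log_transform=True):
    """Filter, zero-fill, and log10-transform a {feature: [values]} table."""
    # 1. Keep features detected (value > 0) in at least min_prevalence of samples.
    kept = {
        name: values
        for name, values in table.items()
        if sum(v > 0 for v in values) / len(values) >= min_prevalence
    }
    nonzero = [v for values in kept.values() for v in values if v > 0]
    if not nonzero:
        return {}
    # 2. Replace zeros with half of the smallest non-zero value
    #    (table-wide minimum: an assumption of this sketch).
    if fill_zeros:
        pseudo = min(nonzero) / 2
        kept = {n: [v if v > 0 else pseudo for v in vals] for n, vals in kept.items()}
    # 3. log10 is undefined for zeros, so it needs step 2 or zero-free data.
    if log_transform:
        if any(v <= 0 for vals in kept.values() for v in vals):
            raise ValueError("log10 requires strictly positive values")
        kept = {n: [math.log10(v) for v in vals] for n, vals in kept.items()}
    return kept


example = {
    "SpeciesA": [0.0, 0.2, 0.4, 0.1, 0.3],   # kept; its zero is filled
    "SpeciesB": [0.0, 0.0, 0.0, 0.0, 0.0],   # never detected: removed
    "SpeciesC": [0.02, 0.0, 0.0, 0.0, 0.0],  # prevalence exactly 20%: kept
}
result = preprocess(example)
print(sorted(result))
```

A feature at exactly 20% prevalence is kept here because the source describes removing features below the threshold.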

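2. Statistical analysis

The statistical method is chosen according to the number of groups:

Comparison type	Default test	Alternative	When to use
Two groups	Mann-Whitney U test	t-test	Mann-Whitney U for non-normally distributed data; t-test when the data in each group are approximately normal
Multiple groups	Kruskal-Wallis test	ANOVA	Overall test for any difference among groups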
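To make the default two-group test concrete, here is a minimal pure-Python sketch of a two-sided Mann-Whitney U test using the normal approximation with tie and continuity corrections. It is illustrative only: the function names are hypothetical, the source does not say which implementation the tool uses, and for very small groups an exact p-value would differ from this approximation.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = rank_with_ties(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of (t^3 - t) over each group of tied values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    # Continuity correction of 0.5, never pushing z below zero.
    z = max(abs(u1 - n1 * n2 / 2) - 0.5, 0.0) / math.sqrt(var)
    p = math.erfc(z / math.sqrt(2))  # two-sided tail of the standard normal
    return u1, p


control = [1, 2, 3, 4, 5]
experimental = [6, 7, 8, 9, 10]
u, p = mann_whitney_u(control, experimental)
print(u, round(p, 4))
```

In practice a library routine such as `scipy.stats.mannwhitneyu` (with `scipy.stats.ttest_ind`, `scipy.stats.kruskal`, and `scipy.stats.f_oneway` for the other tests in the table) would normally be preferred over hand-written code.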

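3. Multiple-testing correction

Testing many features at once inflates the number of false positives, so the tool applies FDR (False Discovery Rate) correction by default:

fdr_bh: Benjamini-Hochberg method (default)
bonferroni: a more conservative correction
none: no correction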
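The three correction options can be illustrated with a short pure-Python sketch. The function name `adjust_pvalues` is hypothetical; only the option names `fdr_bh`, `bonferroni`, and `none` come from the list above, and how the tool computes them internally is not stated in the source.

```python
def adjust_pvalues(pvalues, method="fdr_bh"):
    """Adjust p-values; method is 'fdr_bh', 'bonferroni', or 'none'."""
    m = len(pvalues)
    if method == "none":
        return list(pvalues)
    if method == "bonferroni":
        # Multiply every p-value by the number of tests, capped at 1.
        return [min(p * m, 1.0) for p in pvalues]
    if method != "fdr_bh":
        raise ValueError(f"unknown method: {method}")
    # Benjamini-Hochberg: scale the i-th smallest p-value by m / i, then
    # enforce monotonicity with a running minimum from the largest down.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
bh = adjust_pvalues(raw)
bonf = adjust_pvalues(raw, method="bonferroni")
print([round(p, 3) for p in bh])
print([round(p, 3) for p in bonf])
```

For the four example p-values, Benjamini-Hochberg keeps all of them at or below 0.05 after adjustment, while Bonferroni leaves only two, which shows why it is described as the more conservative option.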

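4. Effect size calculation

In addition to p-values, the tool reports two effect sizes:

Hedges' g: a standardized mean difference based on a pooled standard deviation that weights each group by its sample size, with a correction for small-sample bias
log2FC: the base-2 logarithm of the fold change, expressing the relative change in abundance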
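Both effect sizes can be computed by hand as a check. The sketch below uses the common approximation for the Hedges' g correction factor, J = 1 - 3 / (4(n1 + n2) - 9), and assumes untransformed, positive abundances for log2FC. The function names and the `pseudocount` parameter are hypothetical, and the tool's exact formulas are not given in the source.

```python
import math
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Bias-corrected standardized mean difference (group_a minus group_b)."""
    n1, n2 = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its size.
    pooled_var = (
        (n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)
    ) / (n1 + n2 - 2)
    if pooled_var == 0:
        return math.nan  # undefined when neither group shows any spread
    cohens_d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)  # small-sample correction J
    return cohens_d * correction


def log2_fold_change(group_a, group_b, pseudocount=0.0):
    """log2 of the ratio of group means; pseudocount guards against zero means."""
    return math.log2((mean(group_a) + pseudocount) / (mean(group_b) + pseudocount))


experimental = [2, 4, 6]
control = [1, 2, 3]
g = hedges_g(experimental, control)
lfc = log2_fold_change(experimental, control)
print(round(g, 3), lfc)
```

In this example the experimental mean is twice the control mean, so log2FC is 1. The uncorrected standardized difference of about 1.26 shrinks to about 1.01 once the small-sample correction (0.8 for six samples in total) is applied.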

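Interpreting the Results

Output files

When the analysis finishes, the following files are generated:

two_group_results.xls or multi_group_results.xls: a table containing all statistical results
two_group_comparison.svg or multi_group_barplot.svg: the figure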

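Two-group results table

The two-group results table contains the following columns:

Column	Meaning	How to read it
feature	Feature name	The microbial or functional feature tested
Experimental, Control	Mean abundance per group	Mean relative abundance of the feature in the experimental and control groups
pvalue	Raw p-value	Unadjusted p-value from the statistical test
padj	Adjusted p-value	p-value after multiple-testing correction; use this one to call significance
diff	Difference in means	Mean abundance in the experimental group minus that in the control group (a difference, not a ratio)
ci_low, ci_high	95% confidence interval	Range of uncertainty around the estimated difference
log2FoldChange	Log fold change	log2(experimental/control); positive values mean higher abundance in the experimental group, negative values mean higher abundance in the control group
effect_size	Effect size (Hedges' g)	Standardized magnitude of the between-group difference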
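The source does not state how ci_low and ci_high are computed. As one possible illustration of what a 95% interval around diff represents, the sketch below uses a percentile bootstrap, which may differ from the tool's method. The function name and its parameters are hypothetical.

```python
import random
from statistics import mean


def bootstrap_diff_ci(group_a, group_b, n_boot=2000, level=0.95, seed=0):
    """Return (diff, ci_low, ci_high) for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    tail = (1 - level) / 2
    ci_low = diffs[int(tail * n_boot)]
    ci_high = diffs[int((1 - tail) * n_boot) - 1]
    return mean(group_a) - mean(group_b), ci_low, ci_high


experimental = [0.20, 0.25, 0.22, 0.30, 0.27]
control = [0.10, 0.12, 0.08, 0.15, 0.11]
diff, ci_low, ci_high = bootstrap_diff_ci(experimental, control)
print(round(diff, 3), ci_low > 0)
```

An interval that excludes zero, as in this example where every experimental value exceeds every control value, points the same way as a significant test result. A wide interval signals that the size of the difference is poorly estimated even when the p-value is small.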

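Two-group results figure

The two-group figure has four main panels:

Figure 1: Example two-group comparison figure (colors are user-specified)

Mean abundance bar chart (left): mean relative abundance of each feature in the experimental and control groups, in user-specified colors
Difference plot (center left): between-group difference with its 95% confidence interval (error bars); the point color indicates the group with the higher abundance
Effect size (center right): Hedges' g; the farther a point lies from 0, the larger the effect
Fold change (right): log2FC; positive values indicate enrichment in the experimental group, negative values enrichment in the control group

Interpretation tip: focus on features with an adjusted p-value < 0.05, and weigh the effect size alongside it.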
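The interpretation tip above amounts to a filter on padj followed by a ranking on effect size. A minimal sketch, assuming the results have already been read into a list of dictionaries keyed by the column names feature, padj, and effect_size (the function name and the example values are hypothetical):

```python
def top_hits(rows, alpha=0.05):
    """Keep rows with padj < alpha, ordered by absolute effect size (largest first)."""
    hits = [row for row in rows if row["padj"] < alpha]
    return sorted(hits, key=lambda row: abs(row["effect_size"]), reverse=True)


# Hypothetical rows standing in for the two-group results table.
results = [
    {"feature": "SpeciesA", "padj": 0.01, "effect_size": -1.2},
    {"feature": "SpeciesB", "padj": 0.20, "effect_size": 2.5},
    {"feature": "SpeciesC", "padj": 0.03, "effect_size": 0.4},
]
print([row["feature"] for row in top_hits(results)])  # ['SpeciesA', 'SpeciesC']
```

SpeciesB has the largest effect size but does not pass the adjusted p-value threshold, so it is excluded. SpeciesA ranks first because ranking uses the absolute value of Hedges' g, regardless of direction.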

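Multi-group results table

The multi-group results table contains the following columns:

Column	Meaning	How to read it
feature	Feature name	The microbial or functional feature tested
Group1, Group2, ...	Mean abundance per group	Mean relative abundance of the feature in each group
pvalue	Raw p-value	Unadjusted p-value from the Kruskal-Wallis test (or ANOVA, if selected)
padj	Adjusted p-value	p-value after multiple-testing correction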

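Multi-group results figure

The multi-group figure shows the mean abundance of each feature in every group:

Figure 2: Example multi-group comparison figure (colors are user-specified)

Grouped mean abundance bar chart: each feature has one cluster of bars, with a different color for each group (colors are user-specified)
Significance annotation: the adjusted p-value and significance level are shown on the right
Legend: maps each color to its group

Note: the multi-group comparison tests only for an overall difference among groups. A significant result does not show which specific groups differ; that requires follow-up pairwise comparisons.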
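The significance levels shown next to the adjusted p-values follow the thresholds used in the figure legends of this guide (*p < 0.05, **p < 0.01, ***p < 0.001). A minimal sketch of that mapping; the function name is hypothetical, and returning an empty string for non-significant features is an assumption, since the source does not say how they are labeled.

```python
def significance_stars(padj):
    """Map an adjusted p-value to the star notation used in the figures."""
    if padj < 0.001:
        return "***"
    if padj < 0.01:
        return "**"
    if padj < 0.05:
        return "*"
    return ""  # assumption: non-significant features carry no marker


for value in (0.0004, 0.004, 0.04, 0.4):
    print(value, significance_stars(value))
```

The comparisons are strict, so a value of exactly 0.05 receives no star, matching the "p < 0.05" wording.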

Scientific Paper Writing Guide

Two-Group Comparison Writing Template

Methods Section

To identify differentially abundant microbial features between groups, we performed statistical analysis using the "Two-Group Comparison Horizontal Bar Chart" tool. The raw abundance table was preprocessed by filtering features with prevalence below 20%, filling zeros with half of the minimum non-zero value, and applying log10 transformation to reduce skewness.

Differential abundance between the experimental and control groups was assessed using the Mann-Whitney U test. To control the false discovery rate (FDR) across multiple tests, we applied the Benjamini-Hochberg correction. Effect sizes were calculated as Hedges' g and log2 fold change (log2FC) to quantify the magnitude of differences. Statistical significance was defined as an FDR-adjusted p-value < 0.05.

Results Section

Comparative analysis revealed [X] significantly differentially abundant microbial features between the experimental and control groups (FDR < 0.05). Among these, [Y] features were significantly enriched in the experimental group (e.g., [Feature1], log2FC = [value], p_adj = [value]), while [Z] features were enriched in the control group (e.g., [Feature2], log2FC = [value], p_adj = [value]). Effect size analysis indicated that [Feature3] showed the largest between-group difference (Hedges' g = [value]).

Figure Legend

Figure X. Differentially abundant microbial features between experimental and control groups.

(A) Horizontal bar chart showing mean relative abundance of features in the experimental group (user-defined color) and control group (user-defined color).

(B) Difference plot displaying between-group differences with 95% confidence intervals (error bars).

(C) Effect size (Hedges' g) plot, with points farther from zero indicating larger effect sizes.

(D) Fold change (log2FC) plot, with positive values indicating enrichment in the experimental group.

FDR-adjusted p-values and significance levels are shown on the right (*p < 0.05, **p < 0.01, ***p < 0.001).

Multi-Group Comparison Writing Template

Methods Section

To identify differentially abundant microbial features across multiple groups, we employed the "Multi-Group Comparison Horizontal Bar Chart" tool. Data preprocessing included filtering features with prevalence below 20%, zero-value imputation, and log10 transformation.

Global differences among groups were tested using the Kruskal-Wallis test, with FDR adjustment using the Benjamini-Hochberg method. Statistical significance was defined as FDR-adjusted p-value < 0.05.

Results Section

The Kruskal-Wallis test identified [X] microbial features with significant abundance differences across [GroupA], [GroupB], and [GroupC] (FDR < 0.05). Specifically, [Feature1] showed the highest abundance in [GroupA] (mean abundance = [value]), while [Feature2] was most abundant in [GroupC] (mean abundance = [value]). These differentially abundant features may reflect group-specific microbial composition patterns.

Figure Legend

Figure X. Differentially abundant microbial features across multiple groups.

Horizontal bar chart displaying mean relative abundance of the top [N] most significant features across groups, with different colors representing different groups (colors user-defined). FDR-adjusted p-values from Kruskal-Wallis test and significance levels are shown on the right (*p < 0.05, **p < 0.01, ***p < 0.001).

Discussion Section Guidance

The significant enrichment of [FeatureX] in the experimental group may be associated with [biological process]. Previous studies have reported similar patterns of [FeatureX] abundance under [similar conditions] [citation]. Additionally, the enrichment of [FeatureY] in the control group might reflect [adaptive response]. These differentially abundant microbial features could serve as potential biomarkers worthy of further investigation into their functional significance.

Note: The discussion should integrate specific research context and existing literature to explain the biological significance of differential features.

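Caveats and Best Practices

Important caveats:

Make sure the input files are correctly formatted and that sample IDs match exactly between the abundance table and the metadata table
Consider how sample size affects statistical power; small samples may fail to detect real differences
Interpret borderline results (for example, padj between 0.04 and 0.05) with caution; they may need further validation
Effect sizes reflect biological relevance better than p-values do, so consider both together
Multiple-testing correction controls false positives but can increase false negatives; weigh the two against each other
Figure colors are user-specified, so state clearly in the paper which group each color represents

Best-practice recommendations:

Report the statistical test, the correction method, and the significance threshold explicitly in the paper
Report both p-values and effect sizes to give a more complete picture of the results
Validate key findings experimentally or in an independent cohort
Keep the compositional nature of microbiome data in mind, and consider complementary methods such as ANCOM
Provide the full differential analysis results table in the supplementary material
State in the figure legend which color corresponds to the experimental group and which to the control group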

References

Gao, Y., Zhang, G., Jiang, S., & Liu, Y.-X. (2024). Wekemo Bioincloud: A user-friendly platform for meta-omics data analyses. iMeta, 3, e175. https://doi.org/10.1002/imt2.175
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B, 57(1), 289-300.
McKnight, P. E., & Najab, J. (2010). Mann-Whitney U Test. In The Corsini Encyclopedia of Psychology (pp. 1-1).
Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related estimators. Journal of Educational Statistics, 6(2), 107-128.