
    P-value Calculator

    Compute p-value from z-score for a normal distribution

    P-values

    For a given z-score, the calculator reports five p-values: left tail P(x < Z), right tail P(x > Z), from the center P(0 < x < Z), between P(-Z < x < Z), and two tails P(x < -Z or x > Z).

    A p-value is the probability of observing results at least as extreme as those obtained, assuming the null hypothesis is true. It is the cornerstone of statistical hypothesis testing. This calculator converts a test statistic (z-score, t-score, F-statistic, or chi-square) into a p-value for one-tailed and two-tailed tests.
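    Each of the five p-values reported above can be computed from the standard normal CDF. A minimal sketch using only the Python standard library (`math.erf`); the function names are illustrative, not part of the calculator itself:

```python
import math

def phi(z):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_values(z):
    """The five p-values the calculator reports for a z-score."""
    a = abs(z)
    return {
        "left":       phi(z),                   # P(x < Z)
        "right":      1.0 - phi(z),             # P(x > Z)
        "center":     phi(a) - 0.5,             # P(0 < x < |Z|)
        "between":    2.0 * phi(a) - 1.0,       # P(-|Z| < x < |Z|)
        "two_tailed": 2.0 * (1.0 - phi(a)),     # P(x < -|Z| or x > |Z|)
    }

# z = 1.96 gives a two-tailed p-value of about 0.05
print(p_values(1.96))
```

    Note that "left" and "right" always sum to 1, and "between" plus "two_tailed" always sum to 1, which is a quick sanity check on any implementation.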

    Understanding P-values

    A small p-value means the data are unlikely under the null hypothesis, suggesting evidence against it. A large p-value means the data are consistent with the null hypothesis. The significance threshold α (typically 0.05) determines when we "reject" the null hypothesis.

    P-value             Interpretation
    p < 0.001           Very strong evidence against the null hypothesis
    0.001 ≤ p < 0.01    Strong evidence
    0.01 ≤ p < 0.05     Moderate evidence (often considered significant)
    0.05 ≤ p < 0.1      Weak evidence (often borderline)
    p ≥ 0.1             Insufficient evidence to reject the null hypothesis

    One-tailed vs Two-tailed Tests

    A two-tailed test detects a difference in either direction (greater or less than). A one-tailed test detects a difference in one specific direction. For the same data, a one-tailed p-value is half the two-tailed p-value. Use a one-tailed test only when you had a clear directional hypothesis before collecting the data.

    Two-tailed: p = 2 × P(Z > |z|)
    One-tailed: p = P(Z > z) or p = P(Z < z)
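    The halving relationship can be checked numerically. A small sketch, again using only the standard library:

```python
import math

def phi(z):
    # Standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = 1.7
one_tailed = 1.0 - phi(z)                # P(Z > z), directional
two_tailed = 2.0 * (1.0 - phi(abs(z)))   # P(|Z| > |z|), either direction

# The two-tailed p-value is exactly twice the one-tailed p-value
assert math.isclose(two_tailed, 2.0 * one_tailed)
```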

    Frequently Asked Questions

    Does p < 0.05 prove the null hypothesis is false?

    No. A p-value only tells you the probability of the data given the null hypothesis, not the probability that the null hypothesis is false. Statistical significance does not prove causation, does not measure effect size, and does not guarantee practical importance. A very large sample can find "statistically significant" differences that are trivially small in practice.

    What is the p-value of 0.05 threshold and why is it used?

    The 0.05 threshold was proposed by Ronald Fisher in the 1920s as a convenient benchmark. There is nothing mathematically special about it. Some fields require much stricter thresholds: genome-wide association studies commonly use 5×10⁻⁸, and particle physics requires the 5σ level (about 3×10⁻⁷). In exploratory research, 0.1 may be acceptable. The threshold should be chosen based on the cost of Type I vs Type II errors in your specific application.

    What is the difference between Type I and Type II error?

    Type I error (false positive): Rejecting a true null hypothesis. Probability = α (significance level). Type II error (false negative): Failing to reject a false null hypothesis. Probability = β. Statistical power = 1 - β. There is a tradeoff: reducing α (being more conservative) increases the chance of Type II error.
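    The claim that α equals the Type I error rate can be seen directly in simulation: when the null hypothesis is true, a test at α = 0.05 rejects it about 5% of the time. A hypothetical sketch (the sample size and seed are arbitrary choices for illustration):

```python
import math
import random

def phi(z):
    # Standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

random.seed(0)
alpha, n_tests, n = 0.05, 5000, 30
rejections = 0
for _ in range(n_tests):
    # Draw a sample from the null: standard normal, true mean 0
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    # z-test of H0: mean = 0 (known sd = 1)
    z = (sum(sample) / n) * math.sqrt(n)
    p = 2.0 * (1.0 - phi(abs(z)))
    if p < alpha:
        rejections += 1

# The observed rejection rate should be close to alpha = 0.05
print(rejections / n_tests)
```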

    What is p-hacking?

    P-hacking is the practice of manipulating analyses, sample collection, or reporting to achieve a p < 0.05 result. Running many tests and reporting only the significant ones, adding data points until significant, or trying different subgroup analyses all inflate the chance of false positives. Pre-registration of hypotheses and analysis plans is the primary defense against p-hacking.
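    The inflation from running many tests follows directly from basic probability: with k independent tests of a true null hypothesis at α = 0.05, the chance of at least one false positive is 1 − 0.95^k. A quick sketch:

```python
# Family-wise false-positive rate for k independent tests at alpha = 0.05
alpha = 0.05
for k in [1, 5, 20, 100]:
    fwer = 1.0 - (1.0 - alpha) ** k
    print(f"{k:4d} tests -> P(at least one p < 0.05) = {fwer:.3f}")
```

    At 20 independent tests the chance of at least one spurious "significant" result is already about 64%, which is why selective reporting is so misleading.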