Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 22 additions & 2 deletions src/asm_parse.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,9 @@ using crab::number_t;
#define DOT "[.]"
#define TYPE R"_(\s*(shared|number|packet|stack|ctx|map_fd|map_fd_programs)\s*)_"

// regex to match "require [assertion1, assertion2, ...]"
#define REQUIRE R"_(\s*require\s*\[\s*(.*?)\s*\]\s*)_"

static const std::map<std::string, Bin::Op> str_to_binop = {
{"", Bin::Op::MOV}, {"+", Bin::Op::ADD}, {"-", Bin::Op::SUB}, {"*", Bin::Op::MUL},
{"/", Bin::Op::UDIV}, {"%", Bin::Op::UMOD}, {"|", Bin::Op::OR}, {"&", Bin::Op::AND},
Expand Down Expand Up @@ -133,7 +136,8 @@ static Deref deref(const std::string& width, const std::string& basereg, const s
};
}

Instruction parse_instruction(const std::string& line, const std::map<std::string, label_t>& label_name_to_label) {
InstructionOrConstraintsSet parse_instruction(const std::string& line,
const std::map<std::string, label_t>& label_name_to_label) {
Comment on lines +139 to +140
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to add an instruction (assertion) of the kind you want. Then do the check from inside ebpf_checker like any other check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think it's better to do it after #787

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to allow an external party to provide concrete state at a label and to then compare that concrete state to the abstract state for that label.

I.e. if verifier believes:
At label 4:-1 -> "r0.uvalue=0"

But execution of the BPF program results in:
At label 4:-1 -> "r0.uvalue=1"

Then this is a bug, in the VM (or possibly the verifier). This is what I am using the ebpf_check_constraints_at_label for and it has been hugely successful in finding bugs (much more than simply comparing the post-conditions after the program executes).

I am fine redoing this work once the code re-org is complete, but the goal here is to be able to reason about the program state post verification (i.e. when the program is actually executing).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the scenario. Does it happen online or offline? Do you want to query the final result of the computation during the execution of the program?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The BPF VM is single stepped through the instructions. After each instruction, a series of assertions is generated from the concrete state of the VM and checked against the abstract state at that label.

Divergence between the two signifies that either the VM is wrong or the verifier is wrong.

The majority of the verifier bugs I have been filing have been found using this technique (using BPF programs generated by specdif or libfuzzer).

// treat ";" as a comment
std::string text = line.substr(0, line.find(';'));
const size_t end = text.find_last_not_of(' ');
Expand Down Expand Up @@ -221,6 +225,16 @@ Instruction parse_instruction(const std::string& line, const std::map<std::strin
}
return res;
}
if (regex_match(text, m, regex(REQUIRE))) {
std::string constraints = m[1];
std::set<std::string> constraint_set;
std::regex re(R"(\s*,\s*)");
std::sregex_token_iterator first{constraints.begin(), constraints.end(), re, -1}, last;
for (; first != last; ++first) {
constraint_set.insert(*first);
}
return constraint_set;
}
return Undefined{0};
}

Expand All @@ -246,7 +260,13 @@ static InstructionSeq parse_program(std::istream& is) {
if (line.empty()) {
continue;
}
Instruction ins = parse_instruction(line, {});
auto ins_or_constraints = parse_instruction(line, {});

if (std::holds_alternative<ConstraintsSet>(ins_or_constraints)) {
continue;
}

Instruction ins = convert_to_original<Instruction>(ins_or_constraints).value_or(Undefined{0});
if (std::holds_alternative<Undefined>(ins)) {
continue;
}
Expand Down
21 changes: 20 additions & 1 deletion src/asm_parse.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,26 @@
#pragma once

#include <string>
#include <vector>

#include "asm_syntax.hpp"
#include "crab/interval.hpp"
#include "crab/linear_constraint.hpp"

Instruction parse_instruction(const std::string& line, const std::map<std::string, label_t>& label_name_to_label);
InstructionOrConstraintsSet parse_instruction(const std::string& line,
const std::map<std::string, label_t>& label_name_to_label);

/***
* Parse a set of string form linear constraints into a vector of crab::linear_constraint_t
*
* @param[in] constraints A set of string form linear constraints.
* @param[out] numeric_ranges A vector of numeric ranges that are extracted from the constraints.
*
* @return A vector of crab::linear_constraint_t
* Example of string constraints include:
* r0.type=number
* r0.uvalue=5
* r0.svalue=5
*/
std::vector<crab::linear_constraint_t> parse_linear_constraints(const std::set<std::string>& constraints,
std::vector<crab::interval_t>& numeric_ranges);
80 changes: 80 additions & 0 deletions src/asm_syntax.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,49 @@ struct label_t {
explicit label_t(const int index, const int to = -1, std::string stack_frame_prefix = {}) noexcept
: from(index), to(to), stack_frame_prefix(std::move(stack_frame_prefix)) {}

/**
* @brief Construct a new label t object from a string.
*
* @param[in] string_label The string representation of the label.
* @return None (constructor)
*
* @throw std::invalid_argument The label format is invalid.
* @throw std::out_of_range The label value causes numeric overflow.
*
* Format: [prefix/]from[:to]
* - prefix: Optional stack frame prefix
* - from: Source instruction number
* - to: Optional jump target (-1 means next instruction)
*
* Example labels:
* "2:-1" - a label which falls through to the next instruction.
* "2:5" - a label which jumps to instruction 5.
* "2:-1/5:-1" - a label which falls through to the next instruction, with a stack frame prefix denoting where the
* label was called.
*/
explicit label_t(std::string_view string_label) {
auto pos = string_label.find(STACK_FRAME_DELIMITER);
if (pos != std::string_view::npos) {
stack_frame_prefix = std::string(string_label.substr(0, pos));
string_label = string_label.substr(pos + 1);
}

pos = string_label.find(':');
try {
if (pos != std::string_view::npos) {
from = std::stoi(std::string(string_label.substr(0, pos)));
to = std::stoi(std::string(string_label.substr(pos + 1)));
} else {
from = std::stoi(std::string(string_label));
to = -1;
}
} catch (const std::invalid_argument& e) {
throw std::invalid_argument("Invalid label format: " + std::string(string_label));
} catch (const std::out_of_range& e) {
throw std::out_of_range("Label value out of range: " + std::string(string_label));
}
}

static label_t make_jump(const label_t& src_label, const label_t& target_label) {
return label_t{src_label.from, target_label.from, target_label.stack_frame_prefix};
}
Expand Down Expand Up @@ -315,11 +358,48 @@ struct IncrementLoopCounter {
bool operator==(const IncrementLoopCounter&) const = default;
};

// Helper metafunction to append a type to a variant
template <typename Variant, typename NewType>
struct append_to_variant;

template <typename... Types, typename NewType>
struct append_to_variant<std::variant<Types...>, NewType> {
using type = std::variant<Types..., NewType>;
};

// Helper metafunction to check if a type is in a variant
template <typename Variant, typename T>
struct is_type_in_variant;

template <typename T, typename... Types>
struct is_type_in_variant<std::variant<Types...>, T> : std::disjunction<std::is_same<T, Types>...> {};

// Function to convert ExtendedVariant to OriginalVariant if it doesn't contain the new type
template <typename OriginalVariant, typename ExtendedVariant>
std::optional<OriginalVariant> convert_to_original(const ExtendedVariant& extendedVariant) {
std::optional<OriginalVariant> result;

std::visit(
[&result](auto&& arg) {
using T = std::decay_t<decltype(arg)>;
if constexpr (is_type_in_variant<OriginalVariant, T>::value) {
result = arg;
}
},
extendedVariant);
return result;
}

using Instruction = std::variant<Undefined, Bin, Un, LoadMapFd, Call, CallLocal, Callx, Exit, Jmp, Mem, Packet, Atomic,
Assume, IncrementLoopCounter>;

using ConstraintsSet = std::set<std::string>;
using InstructionOrConstraintsSet = append_to_variant<Instruction, ConstraintsSet>::type;
using LabeledInstruction = std::tuple<label_t, Instruction, std::optional<btf_line_info_t>>;
using LabeledConstraints = std::tuple<label_t, ConstraintsSet>;
using InstructionSeq = std::vector<LabeledInstruction>;
using ConstraintsSeq = std::vector<LabeledConstraints>;
using InstructionAndConstraintsSeq = std::tuple<InstructionSeq, ConstraintsSeq>;

/// Condition check whether something is a valid size.
struct ValidSize {
Expand Down
3 changes: 3 additions & 0 deletions src/config.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,9 @@ struct ebpf_verifier_options_t {

// Print the BTF types in JSON format.
bool dump_btf_types_json = false;

// Store pre-invariants for use in ebpf_check_constraints_at_label and ebpf_get_invariants_at_label.
bool store_pre_invariants = false;
};

struct ebpf_verifier_stats_t {
Expand Down
71 changes: 71 additions & 0 deletions src/crab_verifier.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
#include "crab/fwd_analyzer.hpp"
#include "crab_utils/lazy_allocator.hpp"

#include "asm_parse.hpp"
#include "asm_syntax.hpp"
#include "crab_verifier.hpp"
#include "string_constraints.hpp"
Expand Down Expand Up @@ -162,6 +163,14 @@ static checks_db get_analysis_report(std::ostream& s, const cfg_t& cfg, const cr
return db;
}

static thread_local std::optional<crab::invariant_table_t> saved_pre_invariants = std::nullopt;

static void save_invariants_if_needed(const crab::invariant_table_t& pre_invariants) {
if (thread_local_options.store_pre_invariants) {
saved_pre_invariants = pre_invariants;
}
}

static checks_db get_ebpf_report(std::ostream& s, const cfg_t& cfg, program_info info,
const ebpf_verifier_options_t& options,
const std::optional<InstructionSeq>& prog = std::nullopt) {
Expand All @@ -174,6 +183,7 @@ static checks_db get_ebpf_report(std::ostream& s, const cfg_t& cfg, program_info
// Get dictionaries of pre-invariants and post-invariants for each basic block.
ebpf_domain_t entry_dom = ebpf_domain_t::setup_entry(true);
auto [pre_invariants, post_invariants] = run_forward_analyzer(cfg, std::move(entry_dom));
save_invariants_if_needed(pre_invariants);
return get_analysis_report(s, cfg, pre_invariants, post_invariants, prog);
} catch (std::runtime_error& e) {
// Convert verifier runtime_error exceptions to failure.
Expand Down Expand Up @@ -220,6 +230,7 @@ std::tuple<string_invariant, bool> ebpf_analyze_program_for_test(std::ostream& o
try {
const cfg_t cfg = prepare_cfg(prog, info, options.cfg_opts);
auto [pre_invariants, post_invariants] = run_forward_analyzer(cfg, std::move(entry_inv));
save_invariants_if_needed(pre_invariants);
const checks_db report = get_analysis_report(std::cerr, cfg, pre_invariants, post_invariants);
print_report(os, report, prog, false);

Expand Down Expand Up @@ -262,4 +273,64 @@ void ebpf_verifier_clear_thread_local_state() {
global_program_info.clear();
crab::domains::clear_thread_local_state();
crab::domains::SplitDBM::clear_thread_local_state();
saved_pre_invariants = std::nullopt;
}

bool ebpf_check_constraints_at_label(std::ostream& os, const std::string& label_string,
const std::set<std::string>& constraints) {
try {
label_t label = label_t(label_string);
if (!saved_pre_invariants.has_value()) {
os << "No pre-invariants available\n";
return false;
}
if (saved_pre_invariants.value().find(label) == saved_pre_invariants.value().end()) {
os << "No pre-invariants available for label " << label << "\n";
return false;
}
ebpf_domain_t from_inv(saved_pre_invariants.value().at(label));
auto concrete_domain = ebpf_domain_t::from_constraints(constraints, false);

if (concrete_domain.is_bottom()) {
os << "The provided constraints are unsatisfiable and self-contradictory (concrete domain is bottom)\n";
os << concrete_domain << "\n";
return false;
}
if (from_inv.is_bottom()) {
os << "The abstract state is unreachable\n";
os << from_inv << "\n";
return false;
}

if ((from_inv & concrete_domain).is_bottom()) {
os << "Concrete state does not match invariant\n";

// Print the concrete state
os << "--- Concrete state ---\n";
os << concrete_domain << "\n";

os << "--- Abstract state ---\n";
os << from_inv << "\n";

return false;
}

return true;
} catch (std::exception& e) {
os << "Error occurred while checking constraints: " << e.what() << "\n";
return false;
}
}

std::set<std::string> ebpf_get_invariants_at_label(const std::string& label) {
// If the label is malformed, throw an exception so the caller can handle it.
label_t l = label_t(label);

if (!saved_pre_invariants.has_value()) {
return {};
}
if (saved_pre_invariants.value().find(l) == saved_pre_invariants.value().end()) {
return {};
}
return saved_pre_invariants.value().at(l).to_set().value();
}
39 changes: 39 additions & 0 deletions src/crab_verifier.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,3 +26,42 @@ int create_map_crab(const EbpfMapType& map_type, uint32_t key_size, uint32_t val
EbpfMapDescriptor* find_map_descriptor(int map_fd);

void ebpf_verifier_clear_thread_local_state();

/**
* @brief Given a label and a set of concrete constraints, check if the concrete constraints match the abstract
* verifier constraints at the label. Requires the `store_pre_invariants` option to be set.
*
* Abstract constraints are computed by the verifier and stored if the `store_pre_invariants` option is set.
* These constraints represent the program state at a specific point in the control flow graph,
* as determined by the static analysis performed by the verifier.
*
* If the 'store_pre_invariants' option is not set, this function will always return false along with an error message.
* This is because the verifier did not store the abstract constraints at each label.
*
* For invalid labels, this function will return false along with an error message.
*
* @param[in,out] os Print output to this stream.
* @param[in] label The location in the CFG to check against.
* @param[in] constraints The concrete state to check.
* @retval true The state is valid.
* @retval false The state is invalid.
*
* Note:
* This can also return false if the label is not found in the CFG or if the label is malformed.
*/
bool ebpf_check_constraints_at_label(std::ostream& os, const std::string& label,
const std::set<std::string>& constraints);
/**
* @brief Get the invariants at a given label. Requires the `store_pre_invariants` option to be set.
*
* If the 'store_pre_invariants' option is not set, this function will return an empty set
* as no invariants were stored during verification.
*
* @param[in] label The label in the CFG where invariants should be retrieved
* @return The set of invariants at the given label.
* Each invariant represents a constraint on the program state at this point.
* Returns an empty set if no invariants are available.
* @throw std::invalid_argument The label format is invalid
* @throw std::out_of_range The label value causes numeric overflow
*/
std::set<std::string> ebpf_get_invariants_at_label(const std::string& label);
Loading
Loading