7 Matching Annotations
  1. Dec 2024
    1. This method systematically explores the repository knowledge graph and prioritizes the discovery of critical information such as repository functions and dependency structures that have a greater impact on resolving issues. By simulating multiple trajectories and evaluating their importance, MCTS dynamically narrows the search space and focuses computational resources on the most relevant areas.

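      This method systematically explores the repository knowledge graph, identifies the key information that matters most for resolving the issue (such as repository functions and dependency structures), and ranks the results by importance. By simulating multiple trajectories and evaluating their importance, MCTS dynamically narrows the search space and concentrates computational resources on the most relevant areas.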
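
The search described in this highlight can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the toy `GRAPH`, the `ISSUE_KEYWORDS`, and the keyword-overlap `reward` are all hypothetical stand-ins, and the selection rule is plain UCT.

```python
import math
import random

# Hypothetical toy repository graph: each node maps to the nodes it contains.
GRAPH = {
    "repo": ["pkg/io.py", "pkg/parse.py"],
    "pkg/io.py": ["read_file", "write_file"],
    "pkg/parse.py": ["parse_header", "parse_body"],
}
ISSUE_KEYWORDS = ("parse", "header")  # assumed keywords taken from an issue


def reward(name):
    """Assumed reward: fraction of issue keywords that occur in the node name."""
    return sum(k in name for k in ISSUE_KEYWORDS) / len(ISSUE_KEYWORDS)


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def uct(self, c=1.4):
        # Unvisited nodes are tried first; otherwise balance mean reward
        # (exploitation) against an exploration bonus.
        if self.visits == 0:
            return math.inf
        exploit = self.value / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore


def mcts(root_name, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(root_name)
    for _ in range(iterations):
        # Selection: follow the highest-UCT child down to a node without children.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: expand the root immediately, other nodes once visited.
        if node is root or node.visits > 0:
            node.children = [Node(n, node) for n in GRAPH.get(node.name, [])]
            if node.children:
                node = rng.choice(node.children)
        # Simulation: random walk down to a leaf, scored by the assumed reward.
        name = node.name
        while GRAPH.get(name):
            name = rng.choice(GRAPH[name])
        score = reward(name)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += score
            node = node.parent
    return root


root = mcts("repo")
for child in sorted(root.children, key=lambda n: -n.visits):
    print(child.name, child.visits)
```

Visit counts act as the importance ranking here: branches whose simulated trajectories score well are revisited more often, which is one way to read "narrowing the search space".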

    2. Excessive reference relationships may increase the complexity of the graph structure and affect the analysis efficiency and accuracy of the model.

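      Too many reference relationships may increase the complexity of the graph structure and hurt the model's analysis efficiency and accuracy.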

    1. The main role of SBFL-identified methods is to reveal more “hints” on relevant classes and methods beyond those mentioned in the problem statement. The LLM agent can then use the context retrieval APIs to examine these methods. Since the SBFL-identified methods are presented to the agent together with the problem statements, the agent can then cross-reference between these two sources of information.

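      The main role of the SBFL-identified methods is to surface additional “hints” about relevant classes and methods beyond those mentioned in the problem statement. The LLM agent can then use the context retrieval APIs to examine these methods. Because the SBFL-identified methods are presented to the agent together with the problem statement, the agent can cross-reference the two sources of information.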
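
For readers unfamiliar with SBFL (spectrum-based fault localization), here is a small sketch of how suspicious methods might be ranked from test coverage. The highlight does not say which scoring formula is used; the Ochiai formula below is a common choice and is an assumption here, as are the example tests and method names.

```python
import math


def ochiai_ranking(coverage, passed):
    """Rank methods by Ochiai suspiciousness.

    coverage: test name -> set of methods the test executed
    passed:   test name -> True if the test passed
    """
    total_failed = sum(1 for ok in passed.values() if not ok)
    methods = set().union(*coverage.values())
    scores = {}
    for method in methods:
        # ef / ep: failing / passing tests that executed this method.
        ef = sum(1 for t, cov in coverage.items() if method in cov and not passed[t])
        ep = sum(1 for t, cov in coverage.items() if method in cov and passed[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores[method] = ef / denom if denom else 0.0
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical coverage data: one failing test and two passing tests.
coverage = {
    "test_parse": {"parse", "load"},
    "test_load": {"load"},
    "test_save": {"load", "save"},
}
passed = {"test_parse": False, "test_load": True, "test_save": True}

for method, score in ochiai_ranking(coverage, passed):
    print(f"{method}: {score:.3f}")
```

The top-ranked names are what would be handed to the agent as extra hints alongside the problem statement.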

  2. Nov 2024
    1. To this end, we develop a novel ASE method named RepoUnderstander by guiding agents to comprehensively understand the whole repositories. Specifically, we first condense the critical information of the whole repository into the repository knowledge graph in a top-to-down mode to decrease the complexity of repository. Subsequently, we empower the agents the ability of understanding whole repository by proposing a Monte Carlo tree search based repository exploration strategy. In addition, to better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan. Then, they can manipulate the tools to dynamically acquire information and generate the patches to solve the real-world GitHub issues.

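      The paper proposes a novel method named RepoUnderstander, which guides agents to comprehensively understand the whole repository through the following steps:

      • Build a repository knowledge graph: condense the key information of the whole repository into a knowledge graph in a top-down manner to reduce the repository's complexity.
      • Monte Carlo tree search based repository exploration: give the agents the ability to understand the whole repository by simulating multiple paths and evaluating their reward scores, progressively narrowing the search space and steering the agents toward the most relevant areas.
      • Information use and patch generation: guide the agents to summarize, analyze, and plan, then operate tools to dynamically acquire information and generate patches that resolve real-world GitHub issues.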
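
The first step above, condensing a repository into a graph from the top down, might look something like the sketch below. It is only an assumption-laden illustration: it records containment edges (repository → file → class → function) using Python's `ast` module, whereas the paper's graph also carries other relations such as references. The edge format and the `path::name` naming scheme are made up for this example.

```python
import ast

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def build_graph(files):
    """Return (parent, relation, child) edges for a mapping of path -> source text."""
    edges = []
    for path, source in files.items():
        edges.append(("repo", "contains", path))
        for node in ast.parse(source).body:
            if isinstance(node, ast.ClassDef):
                cls = f"{path}::{node.name}"
                edges.append((path, "contains", cls))
                # One more level down: methods defined directly in the class.
                for item in node.body:
                    if isinstance(item, FUNC_TYPES):
                        edges.append((cls, "contains", f"{cls}.{item.name}"))
            elif isinstance(node, FUNC_TYPES):
                edges.append((path, "contains", f"{path}::{node.name}"))
    return edges


# Hypothetical one-file repository.
files = {"a.py": "class A:\n    def f(self):\n        pass\n\ndef g():\n    pass\n"}
for edge in build_graph(files):
    print(edge)
```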
  3. Oct 2024
    1. To identify the essential code elements needed to complete the given infilling method m in a repository, a naive solution might scan the entire codebase for all accessible elements, which would introduce excessive noise. Another approach could focus on methods with similar signatures or contexts; however, these often provide irrelevant elements that do not serve m’s functional purpose, leading to redundancy and missing critical elements.

      problematic methods

    1. pruning the specific implementations of functions in all dependent files does not significantly reduce the accuracy of completions

      Isn't this fairly obvious?
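
To make the pruning in this highlight concrete, here is one way it could be done for Python dependencies: keep signatures and docstrings, replace bodies with `...`. This is a sketch under that assumption, not the procedure from the paper; `BodyPruner` and `prune_source` are hypothetical names, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


class BodyPruner(ast.NodeTransformer):
    """Replace every function body with `...`, keeping signature and docstring."""

    def _prune(self, node):
        doc = ast.get_docstring(node)
        body = []
        if doc is not None:
            body.append(ast.Expr(ast.Constant(doc)))
        body.append(ast.Expr(ast.Constant(...)))
        node.body = body
        return node

    # Apply the same pruning to sync and async functions (methods included).
    visit_FunctionDef = _prune
    visit_AsyncFunctionDef = _prune


def prune_source(source):
    tree = BodyPruner().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


# Hypothetical dependency file.
example = '''
def add(a, b):
    """Add two numbers."""
    total = a + b
    return total
'''
print(prune_source(example))
```

What survives is the interface a completion model needs in order to call the function, which is presumably why dropping the bodies costs little accuracy.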

    1. • Greedy Selection. Retrieval is performed if <cc> is the most likely token following <eof>. • Threshold Selection. If the probability of <cc>

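      greedy: retrieval is triggered as long as <cc> has the highest probability, whatever that probability is.

      threshold: the probability of <cc> must reach a set threshold.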
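
The difference between the two rules can be shown with a small sketch. The function name, the dictionary of next-token probabilities, the default threshold of 0.5, and the use of `>=` are all assumptions; the highlighted sentence is cut off before it states the exact threshold condition.

```python
def should_retrieve(next_token_probs, strategy="greedy", threshold=0.5):
    """Decide whether to trigger retrieval from the distribution after <eof>.

    next_token_probs: token -> probability (hypothetical model output)
    """
    p_cc = next_token_probs.get("<cc>", 0.0)
    if strategy == "greedy":
        # <cc> only has to beat every other token, however low its probability.
        return p_cc > 0 and p_cc == max(next_token_probs.values())
    if strategy == "threshold":
        # <cc> has to reach the threshold, whether or not it is the top token.
        return p_cc >= threshold
    raise ValueError(f"unknown strategy: {strategy}")


probs = {"<cc>": 0.4, "def": 0.35, "import": 0.25}
print(should_retrieve(probs, "greedy"))
print(should_retrieve(probs, "threshold", threshold=0.5))
```

With these made-up probabilities the two rules disagree: `<cc>` is the most likely token, yet it falls short of a 0.5 threshold.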