The Spear of Knowledge

An AI agent for book highlights with embeddings

I extended Quoordinates to be "agentic" of sorts. I call it a spear because I picture it as throwing a spear into the embedding space, like spearfishing in shallow water.

From an initial query:

  1. Generate five subquestions
  2. Get highest matching similarity quotes from embeddings DB
  3. Create a smart filtered list of text and results, only choosing relevant quotes
  4. Stitch the results into a final summary

Here's the process in more detail:

  1. The user types in a simple query like "what is the effect of open source on developers making money"
  2. GPT-4 creates five similar but distinct subquestions (about open source and finances in this case)
  3. Each subquestion is searched against the embedding space
  4. Duplicate results are removed, and a GPT-4 smart filter decides whether each quote is relevant to the user query (how close the retrieved quote is to the original query)
  5. Filtered results are stitched together into one paragraph with book citations
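
The search step above relies on a `similaritySearch` helper whose implementation isn't shown in this post. As a rough sketch of what "searching the embedding space" can look like, here is a cosine-similarity ranking over a tiny in-memory store. The names `cosineSimilarity`, `topK`, and `store`, and the toy 3-dimensional vectors, are assumptions for illustration only; the real project reads embeddings from a database.

```javascript
// Hypothetical sketch: not the project's actual similaritySearch implementation.
// Cosine similarity between two equal-length numeric vectors.
const cosineSimilarity = (a, b) => {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
};

// Return the k stored quotes whose embeddings are closest to the query embedding.
const topK = (queryEmbedding, store, k = 3) =>
  store
    .map((item) => ({
      ...item,
      score: cosineSimilarity(queryEmbedding, item.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);

// Toy vectors standing in for real embeddings (assumed data).
const store = [
  { text: "quote about open source", title: "Book A", embedding: [1, 0, 0] },
  { text: "quote about stoicism", title: "Book B", embedding: [0, 1, 0] },
  { text: "quote about commons", title: "Book C", embedding: [0.9, 0.1, 0] },
];

const hits = topK([1, 0, 0], store, 2);
console.log(hits.map((hit) => hit.text));
```

In the real pipeline the query embedding would come from an embedding model rather than being written by hand, and each of the five subquestions would get its own ranked list.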

Check the results out!

Examples of Results

Crabs

I haven't highlighted any crustacean-related quotes, so the spear successfully "gets stuck".

(quoordinates-py3.9) bram@macbook-pro quoordinates % node spear.js what are crabs  

Running query: what are crabs  

Queries:
[
  '"What are the different species of crabs and how do they typically differ from one another?"',
  '"What are the unique anatomical characteristics of crabs that separate them from other crustaceans?"',
  '"How do crabs adapt to their respective habitats and what are their roles in the ecosystem?"',
  '"What are the major threats to crab populations and how do they impact the overall marine biodiversity?"',
  '"How do human activities affect the life cycle and population of crabs?"'
]

Filtered Results (length = 0):
[]

Stitched Results:
Without the provided quotes, I'm unable to create a coherent paragraph for you. Can you please provide the quotes and their corresponding books?

Open Source

3 matches and a coherent spear result about the Tragedy of the Commons! (Plaintext quote result from GPT below the code block)

(quoordinates-py3.9) bram@macbook-pro quoordinates % node spear.js what is the affect of open source on developers making money

Running query: what is the affect of open source on developers making money

Queries:
[
  "Why might open source platforms possibly affect developers' ability to generate revenue?",
  "How does the availability of free code-resources in open source projects impact the overall valuation of developers' work?",
  'How does the open source model challenge traditional profit-making models of software development?',
  'Why might the act of monetization become more complicated for developers once they decide to contribute to open-source projects?',
  'In what ways does the increased collaboration and competition in open source communities influence the income potential for individual developers?'
]

Filtered Results (length = 3):
[
  '> While most open-source developers do not intrinsically object to others profiting from their gifts, most also demand that no party (with the possible exception of the originator of a piece of code) be in a privileged position to extract profits.\n' +
    '\n' +
    '--<cite>The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary</cite>\n' +
    '\n',
  '> Open source code itself is not a common pool resource but a positive externality of its underlying contributor community. Users can consume, or “appropriate,” code at zero marginal cost, because what the commons actually manages is not code but attention. When developers make contributions, they appropriate this attention from the commons.\n' +
    '\n' +
    '--<cite>Working in Public: The Making and Maintenance of Open Source Software</cite>\n' +
    '\n',
  '> Clearly there is some critical way in which the “Tragedy of the Commons” model fails to capture what is actually going on. Part of the answer certainly lies in the fact that using software does not decrease its value. Indeed, widespread use of open-source software tends to increase its value, as users fold in their own fixes and features (code patches). In this inverse commons, the grass grows taller when it’s grazed upon. That this public good cannot be degraded by overuse takes care of half of Hardin’s tragedy, the congested-public-goods problem. It doesn’t explain why open source doesn’t suffer from underprovision. Why don’t people who know the open-source community exists universally exhibit free-rider behavior behavior, waiting for others to do the work they need, or (if they do the work themselves) not bothering to contribute the work back into the commons? Part of the answer lies in the fact that people don’t merely need solutions, they need solutions on time. It’s seldom possible to predict when someone else will finish a given piece of needed work. If the payoff from fixing a bug or adding a feature is sufficient to any potential contributor, that person will dive in and do it (at which point the fact that everyone else is a free rider becomes irrelevant).\n' +
    '\n' +
    '--<cite>The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary</cite>\n' +
    '\n'
]

Stitched Results:
Open-source software serves as a paradoxical model that refutes the 'Tragedy of the Commons', wherein the value of the common resource doesn't diminish with overuse but rather gets enhanced, as suggested by The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. 
Widespread usage generates bug fixes and feature additions - essentially, the software 'grass' grows taller with more 'grazing.' 
The model further withstands the common-pool-resource challenge, the fear of under-provisioning due to free-rider behavior is mitigated by the demand for time-bound solutions. 
An individual contributor, driven by an immediate need for a bug fix or a feature addition, would willingly contribute and the collective free-rider behavior becomes moot. In this context, Working in Public: The Making and Maintenance of Open Source Software points out that users can appropriate, or utilize, the code at no extra cost. 
But this is not exploitation, it's a form of shared ownership, since the real common resource isn't the code, but the developer attention it receives and requires. However, there remains a universal agreement within the community that no party, except possibly the code's originator, should gain undue profits. 
The economic aspect of open-source development is thus fundamentally different, with equity and the dynamism of the code and its community at its core, rather than monetary gains.
Open-source software serves as a paradoxical model that refutes the 'Tragedy of the Commons', wherein the value of the common resource doesn't diminish with overuse but rather gets enhanced, as suggested by The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. Widespread usage generates bug fixes and feature additions - essentially, the software 'grass' grows taller with more 'grazing.' The model further withstands the common-pool-resource challenge, the fear of under-provisioning due to free-rider behavior is mitigated by the demand for time-bound solutions. An individual contributor, driven by an immediate need for a bug fix or a feature addition, would willingly contribute and the collective free-rider behavior becomes moot. In this context, Working in Public: The Making and Maintenance of Open Source Software points out that users can appropriate, or utilize, the code at no extra cost. But this is not exploitation, it's a form of shared ownership, since the real common resource isn't the code, but the developer attention it receives and requires. However, there remains a universal agreement within the community that no party, except possibly the code's originator, should gain undue profits. The economic aspect of open-source development is thus fundamentally different, with equity and the dynamism of the code and its community at its core, rather than monetary gains.

Equanimity

Another example of the spear getting "stuck" early and still going on to form a cohesive argument.

(quoordinates-py3.9) bram@macbook-pro quoordinates % node spear.js what must we do to be equanimous      

Running query: what must we do to be equanimous

Queries:
[
  'Why is equanimity considered crucial for personal growth and inner peace?',
  'How does maintaining a balanced mind in different situations contribute to our overall wellbeing, invoking the need for equanimity?',
  'What are the various external factors or situations that often challenge our equanimity?',
  'Why is it difficult to achieve equanimity despite knowing its importance?',
  'What are the different strategies or life changes that can guide us towards achieving equanimity, and why are they effective?'
]

Filtered Results (length = 3)
[
  '> There is, however, still another, even more powerfully significant way in which the acceleration of change in society increases the difficulty of coping with life. This stems from the fantastic intrusion of novelty, newness into our existence. Each situation is unique. But situations often resemble one another. This, in fact, is what makes it possible to learn from experience. If each situation were wholly novel, without some resemblance to previously experienced situations, our ability to cope would be hopelessly crippled.\n' +
    '\n' +
    '--<cite>Future Shock</cite>\n' +
    '\n',
  '> PRACTICING STOICISM won’t be easy. It will take effort, for example, to practice negative visualization, and practicing self-denial will take more effort still. It will take both effort and willpower to abandon our old goals, such as the attainment of fame and fortune, and replace them with a new goal, namely, the attainment of tranquility.\n' +
    '\n' +
    '--<cite>A Guide to the Good Life: The Ancient Art of Stoic Joy</cite>\n' +
    '\n',
  `> "Some people," he says, "achieve a certain sense of serenity, even in the midst of turmoil, not because they are immune to emotion, but because they have found ways to get just the 'right' amount of change in their lives." The search for that optimum may be what much of the "pursuit of happiness" is about. Trapped, temporarily, with the limited nervous and endocrine systems given us by evolution, we must work out new tactics to help us regulate the stimulation to which we subject ourselves.\n` +
    '\n' +
    '--<cite>Future Shock</cite>\n' +
    '\n'
]

Stitched Results:
The escalating pace of societal change presents challenges to maintaining an equanimous state of mind.
As described in "Future Shock," the continual inundation of novelty and newness in our lives makes each situation unique. 
While these situations often have some resemblance to past experiences, which allows us to draw from those previous encounters, the rapid acceleration of change can exacerbate coping mechanisms. To thrive amidst such rapid changes, individuals must find the right balance of change in their lives. 
Even amid turbulence, some achieve serenity not through insensitivity to emotions, but by creating strategies to regulate the stimuli to which they expose themselves. 
This pursuit of an optimum state could potentially encapsulate a significant portion of our chase for happiness. "A Guide to the Good Life: The Ancient Art of Stoic Joy" suggests engaging in the practice of Stoicism as a method to achieve tranquility amidst continued societal changes. This process involves substantial effort, including the enactment of negative visualization and self-denial, and a substantial shift in goals from the acquisition of fame and fortune to the pursuit of tranquility. 
Serving as a bridge between old and new experiences, Stoicism may aid in defining new tactics for regulating stimulation, potentially achieving a desired equanimity.
The escalating pace of societal change presents challenges to maintaining an equanimous state of mind. As described in "Future Shock," the continual inundation of novelty and newness in our lives makes each situation unique. While these situations often have some resemblance to past experiences, which allows us to draw from those previous encounters, the rapid acceleration of change can exacerbate coping mechanisms. To thrive amidst such rapid changes, individuals must find the right balance of change in their lives. Even amid turbulence, some achieve serenity not through insensitivity to emotions, but by creating strategies to regulate the stimuli to which they expose themselves. This pursuit of an optimum state could potentially encapsulate a significant portion of our chase for happiness. "A Guide to the Good Life: The Ancient Art of Stoic Joy" suggests engaging in the practice of Stoicism as a method to achieve tranquility amidst continued societal changes. This process involves substantial effort, including the enactment of negative visualization and self-denial, and a substantial shift in goals from the acquisition of fame and fortune to the pursuit of tranquility. Serving as a bridge between old and new experiences, Stoicism may aid in defining new tactics for regulating stimulation, potentially achieving a desired equanimity.

Code

(github)

import { Configuration, OpenAIApi } from "openai";
import dotenv from "dotenv";
import { similaritySearch } from "./similarity-search.js";

dotenv.config();

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
  organization: process.env.OPENAI_ORG,
});

const openai = new OpenAIApi(configuration);

const fiveWhysPrompt = (query) =>
  `Given the query: "${query}", generate five whys into the query. Five deeper questions that use the former questions, basically. I only need the questions, not the answers.`;
const isRelevantPromptExplain = (query, text) =>
  `Given the query: "${query}", is the following text relevant? "${text}". Yes or no? Explain why.`;
const isRelevantPrompt = (query, text) =>
  `Given the query: "${query}", is the following text relevant? "${text}". Yes or no? Only answer yes if the text is relevant to the query. If the text is not relevant, answer no.`;
const stitchResultsPrompt = (query, results) =>
  `You are a PHD student (so your knowledge is very good, and you have a great grasp on grammar and understanding relationships between concepts) tasked with the following: Using the following quotes understand their meaning and link them together into a one paragraph coherent summary. Re-arrange them in any way that makes the most coherent point from all of them combined, telling a story of sorts, just explain the logic that connects them: ${results.join(
    " "
  )}. Remember you do not have to use the quotes in the order they were given and might be better switching them around. Here is the query to frame your answer around: "${query}". Make sure to think it out and read each quote carefully.`;

// Send a single-turn prompt to the chat model and return the reply text.
const getCompletion = async (prompt) => {
  const completion = await openai.createChatCompletion({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: prompt }],
  });
  return completion.data.choices[0].message.content;
};

const spear = async (query) => {
  console.log(`Running query: ${query}`);
  const prompt = fiveWhysPrompt(query);
  const completion = await getCompletion(prompt);
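  // One question per line; strip list markers such as "1." or "-" from the front.
  const queries = completion
    .split("\n")
    .map((line) => line.replace(/^\s*(?:\d+[.)]|[-*])\s*/, "").trim())
    .filter((line) => line.length > 0);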
  console.log(queries);
  // Promise.all similaritySearch for each query
  const results = await Promise.all(
    queries.map(async (query) => ({
      query,
      results: await similaritySearch(query),
    }))
  );

  // Track quotes already chosen so the same quote is not used twice.
  const setOfQuotes = new Set();
  const topResults = results.map((result) => {
    // Take the best-ranked of the top three hits that has not been used yet.
    const topResult = result.results
      .slice(0, 3)
      .find((candidate) => !setOfQuotes.has(candidate.text));
    // No unused hit (or no hits at all): mark this slot as empty.
    if (!topResult) {
      return "";
    }

    setOfQuotes.add(topResult.text);
    return `> ${topResult.text}\n\n--<cite>${topResult.title}</cite>\n\n`;
  });

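  // Ask the model, one quote at a time, whether the quote is relevant to the subquestion that retrieved it.
  const filterResultsByRelevance = async (queries, results) => {
    const filteredResults = [];
    for (let i = 0; i < results.length; i++) {
      // Empty strings mark duplicates; skip them without calling the model.
      if (!results[i]) continue;
      const prompt = isRelevantPrompt(queries[i], results[i]);
      const completion = await getCompletion(prompt);
      // Accept replies such as "Yes." or "Yes, because ...", not only a bare "yes".
      if (completion.trim().toLowerCase().startsWith("yes")) {
        filteredResults.push(results[i]);
      }
    }
    return filteredResults;
  };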
  const filteredResults = await filterResultsByRelevance(queries, topResults);
  console.log(filteredResults);
  console.log(filteredResults.length);
  if (filteredResults.length < 1) {
    console.log("Not enough results to stitch together.");
    return;
  }
  const stitchedResults = await getCompletion(
    stitchResultsPrompt(query, filteredResults)
  );
  console.log(stitchedResults);
};

const main = async () => {
  const query = process.argv.slice(2).join(" ");
  await spear(query);
};

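// Report API or network failures instead of leaving an unhandled rejection.
main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});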
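
The relevance filter in the listing above awaits one model call at a time. One possible variation, sketched here and not part of the original script, is to issue the checks in parallel the same way the similarity searches are. The `judge` parameter is a hypothetical stand-in for a function that would wrap `getCompletion(isRelevantPrompt(query, text))`; `filterInParallel` and `stubJudge` are likewise names invented for this sketch, and the stub lets it run without an API key.

```javascript
// Hypothetical variation: run all relevance checks concurrently.
// `judge` is an assumed async function (query, text) => reply string.
const filterInParallel = async (queries, results, judge) => {
  const verdicts = await Promise.all(
    results.map((text, i) =>
      // Empty strings mark duplicates, so skip the model call for them.
      text ? judge(queries[i], text) : Promise.resolve("no")
    )
  );
  // Promise.all preserves input order, so verdicts[i] belongs to results[i].
  return results.filter((_, i) =>
    verdicts[i].trim().toLowerCase().startsWith("yes")
  );
};

// Stub judge for a dry run: answers yes only when the text mentions the query word.
const stubJudge = async (query, text) =>
  text.includes(query) ? "Yes." : "No.";

const demo = filterInParallel(
  ["crabs", "commons", "stoic"],
  ["> a quote about commons", "", "> stoic practice"],
  stubJudge
);
demo.then((kept) => console.log(kept));
```

With only five subquestions the time saved is small, and sending the requests at once may run into API rate limits, so the sequential loop is a reasonable default.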

bramadams.dev is a reader-supported published Zettelkasten. Both free and paid subscriptions are available. If you want to support my work, the best way is by taking out a paid subscription.