Automating CBSE Exam Grading with AI

Impact

  • 60 percent reduction in grading time, giving teachers more time to teach and mentor.
  • More consistent evaluation across students and graders, improving fairness and transparency.
  • Actionable feedback for students, showing where marks were lost and how to improve.

Client overview

Arivihan is an edtech company focused on improving education outcomes in India. They set out to modernize how CBSE board exam style answers and mock tests are evaluated by automating subjective grading and feedback.

The problem

CBSE style grading is high effort and hard to scale:
  • Subjective answers take time to evaluate, especially at school scale.
  • Inconsistency is common, with different evaluators awarding different marks for similar answers.
  • Growing test volume makes manual grading a bottleneck for schools and coaching programs.
Arivihan needed a system that could grade consistently against a marking scheme, at scale, while still giving useful feedback.

Goals

  • Build an AI powered grader for CBSE board exams and mock tests.
  • Ensure grading is consistent and fair, aligned to a predefined marking scheme.
  • Generate detailed, student-friendly feedback that explains deductions and improvement steps.
  • Integrate cleanly into Arivihan’s existing platform via APIs.

The solution

Krazimo built a scalable AI grading system that takes in the question, expected answer structure, and marking scheme, then evaluates student responses to produce both marks and feedback. Key components:
  • Marking scheme based grading: Evaluates subjective answers against defined criteria, not vague similarity.
  • Deduction explanations: Highlights where marks were lost and why.
  • Personalized improvement guidance: Actionable suggestions aligned to the rubric.
  • Reporting: Detailed student and teacher reports to track performance and identify common misconceptions.
  • Integration APIs: Designed for drop-in use inside Arivihan’s edtech workflows.

Architecture overview

  • Ingestion layer: Accepts questions, answer keys, marking schemes, and student responses.
  • Grading engine: Applies transformer-based NLP models fine-tuned for subjective grading, guided by the rubric and expected points.
  • Feedback generator: Produces structured feedback mapped to rubric dimensions (what was missing, what was incorrect, what to do next).
  • Reporting layer: Aggregates results for student reports, teacher dashboards, and class-level insights.
  • API layer: FastAPI endpoints for submission, grading, report retrieval, and analytics.
  • Storage and execution: AWS S3 for secure storage of inputs and outputs; AWS Lambda for scalable, serverless execution.
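
As a rough illustration of how marking scheme based grading and deduction explanations fit together, the sketch below turns per-point coverage judgments into marks and feedback notes. It is not the production grader: the names `RubricPoint` and `grade` and the sample rubric are hypothetical, and the decision about which rubric points an answer covers (made by the fine-tuned model in the real system) is simply passed in as a set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricPoint:
    key: str          # hypothetical identifier for one expected point
    description: str  # what the marking scheme expects to see
    marks: float      # marks awarded when the point is covered


def grade(rubric: list[RubricPoint], covered: set[str]) -> dict:
    """Turn per-point coverage judgments into marks and deduction notes.

    `covered` holds the keys of rubric points judged present in the answer.
    Here it is supplied by hand; a grading model would produce it in practice.
    """
    awarded = 0.0
    deductions = []
    for point in rubric:
        if point.key in covered:
            awarded += point.marks
        else:
            deductions.append(f"-{point.marks:g}: missing '{point.description}'")
    total = sum(p.marks for p in rubric)
    return {"awarded": awarded, "total": total, "deductions": deductions}


# Made-up rubric for a 4-mark question.
rubric = [
    RubricPoint("definition", "definition of photosynthesis", 1),
    RubricPoint("equation", "balanced chemical equation", 2),
    RubricPoint("site", "role of chloroplasts", 1),
]
result = grade(rubric, covered={"definition", "site"})
print(result["awarded"], "/", result["total"])
for note in result["deductions"]:
    print(note)
```

Keeping the arithmetic separate from the model's judgment means the same coverage decisions always yield the same marks, which is one way to get the consistency the marking scheme is meant to provide.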

Implementation snapshot

  • Backend: Python with FastAPI
  • Execution: AWS Lambda
  • Storage: AWS S3
  • Modeling approach: Transformer-based NLP models fine-tuned for CBSE-style subjective grading
  • Delivery timeline: 4 months

Outcome

The AI grader significantly improved Arivihan’s evaluation workflow:
  • Grading time dropped by about 60 percent.
  • Evaluation became more consistent across students and test cycles.
  • Students received clearer, more actionable feedback to improve future answers.
This project shows how AI can modernize education workflows when it is tied to a clear rubric and designed for scale. For Arivihan, the result was faster grading, fairer evaluation, and better feedback—without increasing teacher workload.  

Protecting Your Intellectual Property: What Every Small Business Needs to Know

Intellectual property is often the most valuable asset a small business has, yet it’s also one of the most commonly overlooked. In a comprehensive guide published by the U.S. Chamber of Commerce (CO-), Krazimo CEO Akhil Verghese shares insights from his experience running a technology company on how small businesses can better protect their IP.

Verghese highlights a key blind spot: while large companies typically run training courses explaining what’s proprietary when employees join, small companies tend to get straight to work, leaving employees unclear on what is and isn’t privileged information. This cultural gap creates real risk, especially for tech and AI companies where intellectual property is the core of the business.

He also addresses the power dynamics that small businesses face when negotiating contracts with larger clients. When you’re a small business, it can be difficult to insist on particular contract terms, especially if the client is a large company. This pressure can lead small businesses to sign away IP rights they should be protecting.

The article covers the fundamentals of IP protection, from patents and trademarks to trade secrets and copyrights, and provides actionable steps for businesses at any stage. For AI and technology companies in particular, where proprietary algorithms, training data, and code represent significant competitive advantages, getting IP protection right from the start is essential.

Originally published on CO- by the U.S. Chamber of Commerce. Krazimo is an enterprise AI consulting firm founded by former Google engineers, specializing in reliable generative AI solutions. Read the full article on the U.S. Chamber of Commerce website.

Paid Faster, Paid More – Revolutionizing Restoration

Impact

  • Faster cycle time: Response time to insurer and TPA requests dropped from multiple days to a few hours, including verification.
  • Higher returns: ~6 percent improvement in settlements through more consistent, persistent, standards backed defenses.
  • Less manual work: About 45 hours saved per week for one restoration company.
  • Net value: About 800k in projected annual impact, combining settlement lift plus hours saved.
  • Big market leverage: This pattern scales across 3,000+ restoration companies in the US.

The problem

Restoration is backwards compared to most industries. The work starts immediately (flood, fire, mold), and only after the job is complete does the justification and invoicing battle begin. In practice, carriers and TPAs operate in a “delay, deny, defend” posture that forces restoration teams to prove every decision after the fact.  That proof burden is not simple paperwork. It requires technical precision across standards (IICRC S500, S520, plus state variations), job evidence (photos, moisture logs), and fast, consistent responses. 

The core idea

Automation is only valuable here if it is credible. JSTFYD is built on one central insight: AI can accelerate and improve claim justification only when it is grounded in standards, job level evidence, and historical claim data, with outputs that remain reviewable and auditable. 

What we built

JSTFYD combines three capabilities into one claims platform: standards grounded claim communication, claims aware project management, and estimate comparison plus validation tools. In the product demo, these show up as dedicated modules (JSTFYD Studio, Compare Estimate, Projects, Inbox), built to match how restoration teams actually work day to day.

At a high level, the system is designed so every generated defense can be traced back to the right source material. Four architecture decisions matter most:

1) Hybrid retrieval that is meaning aware and citation accurate

Instead of relying on embeddings alone, JSTFYD uses hybrid retrieval that combines semantic similarity with structured metadata such as standard name, section and page number, document type, and categorized summaries of uploaded items.  This is what keeps citations precise when the dispute hinges on exact standard language.
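
A minimal sketch of the hybrid idea, assuming chunks of a standard are stored with an embedding plus structured metadata: filter on the metadata first, then rank what remains by cosine similarity. The `Chunk` fields, the two-dimensional toy embeddings, and the section and page values are placeholders invented for illustration; they are not real IICRC references or JSTFYD's actual schema.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    embedding: list[float]  # assumed to come from some embedding model
    standard: str           # structured metadata, e.g. "IICRC S500"
    section: str
    page: int


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(chunks, query_embedding, standard=None, top_k=3):
    """Filter on structured metadata, then rank by semantic similarity."""
    candidates = [c for c in chunks if standard is None or c.standard == standard]
    ranked = sorted(
        candidates,
        key=lambda c: cosine(c.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


def cite(chunk: Chunk) -> str:
    """Build the citation from stored metadata rather than generated text."""
    return f"{chunk.standard}, section {chunk.section}, p. {chunk.page}"


# Placeholder corpus: texts, sections, and pages are invented.
chunks = [
    Chunk("placeholder passage on air mover placement", [0.9, 0.1], "IICRC S500", "A.1", 10),
    Chunk("placeholder passage on mold containment", [0.8, 0.2], "IICRC S520", "B.2", 20),
    Chunk("placeholder passage on moisture documentation", [0.1, 0.9], "IICRC S500", "A.3", 30),
]
hits = hybrid_search(chunks, query_embedding=[1.0, 0.0], standard="IICRC S500", top_k=1)
print(cite(hits[0]))
```

Because the citation string is assembled from metadata stored alongside the chunk, the section and page in a reply cannot drift from what was actually retrieved.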

2) Image aware evidence retrieval

Every uploaded photo is converted into a short description, vectorized, and stored with metadata so it can be retrieved as evidence during a dispute. For example, if an insurer challenges equipment placement, the system can retrieve the relevant room photo and use it directly in the response.

3) A ReAct style agent for multi step claim reasoning

A single agent orchestrates tool calls across standard lookup, image retrieval, evidence compilation, response drafting, and email formatting, which is important when one dispute touches multiple standards and multiple pieces of job evidence. 
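
The control flow of a ReAct style loop can be sketched in a few lines: a policy picks the next tool and input from what has been observed so far, the tool runs, and its observation is appended to the history until the policy decides to finish. Everything here is a stand-in: `scripted_policy` replaces the LLM, and the two tools return canned strings instead of querying a standards index or photo store.

```python
from typing import Callable


# Hypothetical tools; real ones would query the standards index and photo store.
def lookup_standard(query: str) -> str:
    return f"[standard excerpt for: {query}]"


def retrieve_image(query: str) -> str:
    return f"[photo description for: {query}]"


TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_standard": lookup_standard,
    "retrieve_image": retrieve_image,
}


def scripted_policy(history: list[tuple[str, str, str]]) -> tuple[str, str]:
    """Stand-in for the LLM: choose the next (action, input) from the history."""
    done = {action for action, _, _ in history}
    if "lookup_standard" not in done:
        return "lookup_standard", "equipment placement"
    if "retrieve_image" not in done:
        return "retrieve_image", "living room air movers"
    evidence = "; ".join(observation for _, _, observation in history)
    return "finish", f"Draft response citing: {evidence}"


def run_agent(policy, max_steps: int = 5) -> str:
    history: list[tuple[str, str, str]] = []
    for _ in range(max_steps):
        action, arg = policy(history)
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)  # act, then record the observation
        history.append((action, arg, observation))
    raise RuntimeError("agent did not finish within max_steps")


draft = run_agent(scripted_policy)
print(draft)
```

The `max_steps` cap is a design choice worth keeping in any real version, since it bounds cost and makes a stuck agent fail loudly instead of looping.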

4) Citation enforced replies, plus precedent

Every justification email cites the relevant IICRC or state standard, references uploaded evidence, and links comparable past claims.  JSTFYD also indexes past restoration jobs and retrieves precedents when insurers challenge similar usage again, strengthening consistency over time. 

How the workflow runs in practice

Workflow A: Set up a job once, then reuse the evidence forever

JSTFYD structures each job as a project, where teams upload invoices, photos, moisture logs, technician notes, equipment lists, and any supporting evidence.  In the demo, the project flow operationalizes this with a guided intake that asks for required documentation like moisture logs and an Xactimate estimate, with optional photo logs and additional documents.

Workflow B: Dispute email in, defensible draft out

JSTFYD integrates with existing email workflows so teams do not have to change how they communicate. When a dispute email arrives, the platform triggers the justification engine.  In the demo, this appears in the Inbox as claim threads tied to projects, with one click email draft generation. The goal is simple: a faster turnaround, with a response that is grounded in standards and supported by job evidence.

Workflow C: Estimate reductions, compared and rebutted line by line

Insurers often send reduced estimates, sometimes dramatically lower, hoping the restoration company accepts them.  JSTFYD’s comparison engine takes the original Xactimate estimate and the insurer’s revised version, highlights removed or reduced line items, explains why they matter, references standards to defend them, and compiles everything into a ready to send email.  In the demo, this is exposed as a Compare Estimate view that shows line items side by side, flags detected changes, and generates a written justification per item.
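
The core of the comparison step can be sketched as a diff over line items. This assumes both estimates have already been parsed into a mapping from line item name to amount, which is a simplification (real Xactimate exports carry far more structure), and the function name `compare_estimates` and the sample figures are invented for illustration.

```python
def compare_estimates(original: dict[str, float], revised: dict[str, float]) -> list[dict]:
    """Flag line items the insurer removed or reduced, in original order."""
    changes = []
    for item, amount in original.items():
        if item not in revised:
            changes.append(
                {"item": item, "change": "removed", "original": amount, "revised": 0.0}
            )
        elif revised[item] < amount:
            changes.append(
                {"item": item, "change": "reduced", "original": amount, "revised": revised[item]}
            )
    return changes


# Made-up amounts for illustration only.
original = {
    "Air mover (per day)": 300.0,
    "Dehumidifier (per day)": 450.0,
    "Antimicrobial application": 120.0,
}
revised = {
    "Air mover (per day)": 300.0,
    "Dehumidifier (per day)": 300.0,
}
for change in compare_estimates(original, revised):
    delta = change["original"] - change["revised"]
    print(f"{change['item']}: {change['change']} by {delta:.2f}")
```

Each flagged item then becomes the unit for which a standards reference and a written justification are produced.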

Workflow D: Validation before sending

Before any invoice or justification email is sent, the system checks that required evidence is present, verifies alignment with standards, confirms documentation completeness, and flags missing photos or logs.  This prevents weak packets from going out and reduces rework, especially for newer staff.
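
A simple version of the completeness check might look like the following. The split into required and recommended items mirrors the intake described earlier (moisture logs and an estimate required, photo logs optional), but the key names and the `validate_packet` function are assumptions, and checking alignment with standards is out of scope for this sketch.

```python
# Assumed document keys; the required/recommended split follows the guided intake.
REQUIRED = {"moisture_log", "estimate"}
RECOMMENDED = {"photo_log"}


def validate_packet(uploaded: set[str]) -> dict:
    """Report whether a packet is ready to send and what is missing."""
    missing = sorted(REQUIRED - uploaded)
    warnings = sorted(RECOMMENDED - uploaded)
    return {"ready": not missing, "missing": missing, "warnings": warnings}


report = validate_packet({"estimate"})
print(report)
```

Blocking on missing required items while only warning on recommended ones keeps the gate strict without stopping a packet that is merely less persuasive than it could be.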

An expert in your pocket for the whole team

Not everyone at a restoration company is fluent in standards or dispute strategy. JSTFYD includes an AI powered claims expert chat that understands IICRC standards, state specific regulations, the full project data, the invoice and evidence, dispute history, and past claim precedents.  This reduces dependence on scarce internal specialists and helps teams get better at documentation and rebuttals over time.

Why the outcomes moved

The operational drivers are clear: faster responses, fewer communication delays, reduced manual email workload, and fewer disputes because well supported justifications close arguments faster. That is the mechanism behind the business outcomes: speed, higher settlement consistency, and large weekly labor savings that compound at scale.

JSTFYD turns a chaotic, adversarial workflow into a structured, defensible, and scalable process by grounding AI in IICRC standards, structured evidence, hybrid retrieval, and precedent. The result is not “AI that writes emails.” It is a claims system that helps restoration companies secure what they rightfully earned.

Why Gartner Says Enterprises Should Avoid AI Browsers — And What It Means for Your Business

Gartner recently issued a stark warning: enterprises should block AI browsers due to the security risks they pose. These agentic browsing tools can expose sensitive data, undermine long-standing browser protections, and create organization-wide vulnerabilities. But is a blanket ban realistic?

In a feature on TechNewsWorld, Krazimo CEO Akhil Verghese offered a candid assessment. While he agrees the security concerns are legitimate, he questions the practicality of Gartner’s advice. AI browsers provide little visibility into what happens to data before it reaches the underlying AI provider, and terms of service can change over time. But expecting individuals or organizations to continuously monitor these shifting policies isn’t realistic either.

The article explores the tension between the productivity benefits of AI-enhanced browsing and the genuine enterprise security risks it introduces. As AI browsers become more capable and more common, organizations face a growing challenge: how to capture the benefits of AI-assisted workflows without exposing sensitive data to unknown backend processing.

For businesses evaluating AI tools, the takeaway is clear: due diligence on data handling and security practices is essential, but blanket bans may not be the answer. A thoughtful, risk-based approach that includes employee education and clear usage policies is likely more effective.

Originally published on TechNewsWorld. Krazimo helps enterprises adopt AI responsibly with a focus on security, reliability, and production-grade engineering. Read the full article at TechNewsWorld.

A Clear Benchmark for Financial Advisors

Key takeaways (impact)

  • Objective benchmarking for advisors: Advisors get clear percentile based positioning, module scores, and strength and weakness profiles, without subjective interviews.
  • Compliance safe by design: Scoring is fully deterministic, and generative AI is not used in the scoring algorithm, which preserves transparency and regulatory credibility.
  • Faster iteration on assessment quality: AI is used upstream to help experts generate and refresh questions, modules, weights, insights, and report templates as markets evolve.
  • Actionable next steps, not just a score: After deterministic scoring, the platform generates context driven action plans grounded in an expert knowledge base.

The problem

Choosing a financial advisor is harder than it should be. Regulations limit what advisors can advertise, the industry lacks standardized evaluation frameworks, and high net worth families often default to trust, referrals, or superficial signals.  That opacity creates four gaps: clients cannot reliably compare advisors, advisors cannot benchmark against peers, firms lack a consistent improvement framework, and matching families to advisors becomes guesswork. 

The key idea

Point93 is built on a simple principle: the evaluation must be deterministic and benchmark aligned, and AI should help design the assessment, not evaluate the people taking it. 

Our solution

Point93 is a structured, multi module self assessment that measures an advisor across capabilities, philosophy, operations, and stewardship, then compares results against peers and expert derived best practices. The system sits on four pillars: expert knowledge ingestion, AI assisted questionnaire creation, deterministic scoring, and a comprehensive reporting engine.

Architecture overview

1) Expert knowledge as the foundation

Point93 starts with practitioner expertise. An experienced advisor provided frameworks, evaluative guidelines, scoring philosophies, operational best practices, risk and compliance considerations, and service quality indicators that form the backbone of the assessment model.  This corpus is processed into a semantic RAG pipeline using vectorization and dot product retrieval, optimized for high precision recall of expert principles when questions and modules are created or refined. 

2) AI assisted assessment creation (upstream, expert controlled)

The questionnaire spans 17 modules, each with 30 to 40 questions, using multiple formats, including multiple choice, rating scales, free form responses, Likert style questions, and scenario based selections.  AI is used heavily in creation to generate initial and replacement questions, update modules, propose scoring weights and point allocation, and produce insight areas, report structures, and feedback templates.  Crucially, this is expert supervised, and knowledge is sourced from the partner advisor, not the public internet. 

3) Deterministic scoring and benchmarking (no generative AI in scoring)

Once an advisor completes the assessment, Point93 applies a fully deterministic scoring engine with defined weights, validated scoring logic, proficiency thresholds, benchmarks from expert knowledge, and comparative markers from peer data.  Outputs include percentile rankings, module level scores, benchmark comparisons, peer charts, weighted aggregate scores, and strength and weakness profiles.  No part of the scoring algorithm involves generative AI, which is a deliberate credibility and regulatory safety decision. 
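
To make "fully deterministic" concrete, here is a minimal sketch of a weighted aggregate and a percentile rank. The weights, module scores, and peer scores are invented, and the percentile definition used (share of peers scoring at or below the advisor) is one common convention chosen for illustration, not necessarily the one Point93 uses.

```python
def weighted_score(module_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of module scores; same inputs always give the same output."""
    total_weight = sum(weights[m] for m in module_scores)
    return sum(module_scores[m] * weights[m] for m in module_scores) / total_weight


def percentile_rank(score: float, peer_scores: list[float]) -> float:
    """Percentage of peers scoring at or below `score`."""
    at_or_below = sum(1 for s in peer_scores if s <= score)
    return 100.0 * at_or_below / len(peer_scores)


# Made-up numbers for illustration.
module_scores = {"operations": 80.0, "stewardship": 60.0}
weights = {"operations": 3.0, "stewardship": 1.0}
peers = [50.0, 70.0, 75.0, 90.0]

aggregate = weighted_score(module_scores, weights)
print(aggregate, percentile_rank(aggregate, peers))
```

Nothing in this path samples or generates text, so a score can be recomputed and audited from the stored answers, weights, and peer data alone.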

4) Reporting that is usable, not just “data”

After scoring, advisors receive a detailed report delivered digitally and via email, with radar charts, bar graphs, percentiles, peer overlays, benchmark maps, narrative insights, action items, and strength and risk zones. 

5) AI generated action plans (the only end user facing AI)

After deterministic scoring is complete, AI uses the advisor’s results plus peer averages and benchmarks to propose concrete improvements across operations, strategy, communication, portfolio management, and practice management, grounded in the expert knowledge base. 

How it works, end to end

  1. Experts shape the evaluation foundation: Partner advisor knowledge is ingested into the RAG knowledge base.
  2. Admins iterate the assessment quickly: When creating or refining modules, RAG retrieves the most relevant expert principles, then AI helps draft questions, weights, and templates.
  3. Advisors complete the assessment: 17 modules, 30 to 40 questions each, mixed formats for higher fidelity.
  4. Deterministic scoring runs: Transparent, repeatable scoring and benchmarking, producing percentiles and comparisons.
  5. Report plus action plan is delivered: Visuals, narrative insights, and AI generated improvement plans.

Results and early value

In early usage, the platform delivered clear benchmarking, visibility into operational blind spots, a structured improvement path, and professional grade reports for advisors.  For firms, it provided a standardized evaluation framework, training and quality improvement tooling, identification of top performers and outliers, and consistent onboarding evaluations. 

Lessons learned

  • Deterministic evaluation is essential in regulated industries, since compliance and credibility depend on transparent logic.
  • Quite simply, if there isn’t a clear need for AI, don’t use it. AI belongs upstream in assessment design, not inside the scoring engine.
  • Expert knowledge beats generic internet data for credibility and relevance.
  • Mixed question types improve fidelity beyond MCQs alone.

What’s next

Point93 is designed to evolve into a marketplace for advisor family matching, including AI driven matching, expanded scoring dimensions, reassessment tools, firm level integrations, and enhanced benchmark models.

Point93 was engineered to make advisor evaluation transparent, fair, and future ready by combining expert grounded assessment design with deterministic scoring, benchmarking, and actionable reporting.

Should AI Companies Pay for Training Data? Our CEO Weighs In

As India proposes a blanket licensing system that would require AI companies to pay creators when their content is used for model training, the debate over AI training data compensation has reached a critical inflection point. TechRound assembled a panel of tech leaders to weigh in, including Krazimo CEO Akhil Verghese.

Verghese’s take is nuanced and thoughtful. He argues that while it may be feasible to compensate large content generators like the New York Times or Reddit, creating a fair system for every blog author whose work contributed to training a state-of-the-art model would be extraordinarily difficult. He identifies three key areas of debate:
  • Whether the transformative way AI reuses content constitutes fair use.
  • Whether the practical difficulty of compensating everyone fairly means the issue can’t be addressed.
  • Whether AI dominance is so strategically important that legal concerns become secondary.

On the fair use question, Verghese is direct: based on how transformers actually work, he finds it difficult to classify AI training data usage as fair use in the traditional sense. He also pushes back on the idea that difficulty justifies inaction, arguing that the brilliant minds who built these models could develop workable compensation structures if they dedicated effort to the problem.

The article features perspectives from six industry experts, making it a comprehensive look at one of the most important policy questions in AI today.

Originally published on TechRound. Krazimo is an AI consulting firm that builds reliable enterprise AI solutions with a focus on engineering excellence. Read the whole story on TechRound.

From Google Engineer to AI Startup Founder: The Krazimo Origin Story

What does it take to leave a senior engineering role at Google and start an AI consulting company from scratch? In an in-depth interview with Tech Startup Network, Krazimo founder Akhil Verghese tells the full story.

Verghese’s journey began at BITS Pilani in India, where he studied physics and civil engineering before pivoting to software. After starting at Fiberlink (later acquired by IBM), he spent years as a machine learning consultant and served as the founding Head of AI at Butter.ai, a startup backed by General Catalyst. In 2019, he joined Google, where he spent six years, ultimately leading reporting projects for Gemini within Google Workspace and advising teams on optimizing LLMs for reliability.

That advisory work is what sparked Krazimo. Verghese saw firsthand how even sophisticated companies struggled to deploy AI reliably in high-stakes environments. The gap between a compelling demo and a production-ready system was vast, and most organizations lacked the engineering discipline to bridge it.

The interview covers Krazimo’s philosophy of enterprise-grade AI: systems that are creative and intelligent yet remain predictable, testable, and auditable. Verghese explains the company’s signature phased launch strategy (shadow launches, human-in-the-loop validation, and only then full automation) and discusses why engineering rigor matters more than ever in the age of generative AI.

Originally published on Tech Startup Network. Krazimo specializes in reliable, enterprise-grade generative AI solutions built by former Google engineers. Read more on the Tech Startup Network.

Why Trust Is the Make-or-Break Factor for Enterprise AI Agents

The promise of agentic AI (autonomous systems that make decisions and execute workflows with minimal human oversight) is enormous. But there’s a catch: if business leaders can’t trust these systems, the technology becomes worthless.

In a feature on Geek Insider, Krazimo CEO Akhil Verghese breaks down exactly why trust in enterprise AI is so often lacking, and what companies can do about it. The core problem? A massive gap between flashy AI demos and production-ready agents. As Verghese puts it, many companies are rushing to market with agents that simply aren’t ready for enterprise environments.

The article outlines three pillars that businesses should demand from any AI agent provider:
  • Determinism: breaking complex workflows into individually testable steps rather than relying on unpredictable one-shot LLM calls.
  • Rigorous testing: using techniques like LLM-on-LLM reflection and outcome-oriented unit tests.
  • Phased launches: progressing from shadow launches to human-in-the-loop validation before full automation.

Verghese also shares his outlook on the future: while LLMs will continue to improve and hallucinate less, the biggest growth opportunity lies in better agent-building best practices and tools. For any enterprise considering AI adoption, this article is a roadmap for doing it responsibly and effectively.

Originally published on Geek Insider. Krazimo is an enterprise AI solutions provider helping companies leverage generative AI with engineering rigor and reliability. Read more on Geek Insider.

Was 2025 Really the Year of the AI Agent? Our Take on What’s Next

2025 was supposed to be the year AI agents went mainstream. So did it live up to the hype? In a year-end analysis by SDxCentral, Krazimo CEO Akhil Verghese provides one of the most grounded assessments of where agentic AI actually stands.

Verghese’s perspective is both ambitious and pragmatic. He believes 40-70% of all white-collar work will be automatable within three years, but is quick to distinguish between automatable and automated. The gap between what’s technically possible and what’s actually deployed in production is significant, and Verghese suggests a 10-year timeline is more realistic for seeing widespread automation of white-collar work as it exists today.

Looking back at 2025, Verghese characterizes it as primarily a testing and experimental phase, and a year of painful lessons for companies that adopted AI solutions without adequate guardrails, success criteria, and maintenance plans. He expects 2026 to continue this pattern of experimentation, with enterprises becoming more sophisticated about how they evaluate and deploy AI.

The article draws on perspectives from multiple industry leaders and provides a comprehensive view of the current state of agentic AI adoption. For business leaders planning their AI strategy, the takeaway is clear: the technology is advancing rapidly, but success depends on engineering discipline, realistic expectations, and a willingness to learn from early failures.

Originally published on SDxCentral. Krazimo is an enterprise AI consulting firm that helps businesses adopt AI with the rigor and reliability needed for production environments. Read the whole story at SDxCentral.

Empowering Professional AI Translations

Impact

  • ~80% reduction in average hours spent while translating technical documents, without appreciable loss in translation quality (AI was able to flag areas that required human translators with high accuracy).
  • Format preserved end to end: Documents are translated and returned in the same file format and structure that users upload, reducing manual rework.
  • Business ready control: Users can tailor translations with domain and writing style controls so output matches professional context.

Client overview

LinguaCore is a GenAI based translation product built for professional workflows, including professional linguists, translation companies, and localization teams, with an emphasis on subject specific, context aware translation.

The problem

Professional translation businesses hit three scaling limits quickly:
  1. Cost does not scale: High quality technical translation becomes expensive when every page requires full human effort.
  2. Documents are not just text: Technical files contain structure and formatting that must survive translation, or teams lose hours rebuilding layouts.
  3. Context matters: Accurate translation is not enough; the output must match domain conventions and the intended tone for business use.

Goals

  • Provide an end to end product for text and document translation.
  • Preserve formatting and file structure for document outputs.
  • Support domain and writing style controls to make translations usable in professional settings.
  • Reduce operating costs while maintaining quality through targeted human review, guided by the AI.

The solution

LinguaCore delivers subject specific, context aware translations for professional users, with an optional human review layer when needed. The product supports both text translation and document translation, with outputs designed to be immediately usable rather than requiring manual cleanup.

Architecture overview

1) Two pipelines, one consistent experience

LinguaCore supports two core translation paths:
  • Text translation, optimized for fast iteration and business communication.
  • Document translation, optimized for structured files where preserving layout, formatting, and file type is critical.

2) Format preserving document translation

For documents, the system must handle more than language conversion. It must:
  • Ingest and parse the file while retaining structural elements
  • Translate text segments while keeping context intact
  • Reconstruct the document so the translated file preserves the original format and layout
This is what allows customers to receive translated manuals, PDFs, and business documents without a separate formatting pass.
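
The parse, translate, reconstruct pattern can be sketched by keeping structure and text in separate fields and touching only the text. This is a toy model: the `Segment` type, the style labels, and the glossary lookup that stands in for the translation model are all assumptions, and real file formats need format-specific parsers and writers.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Segment:
    style: str  # structural information that must survive, e.g. "heading"
    text: str   # the only field that gets translated


def translate_document(
    segments: list[Segment], translate: Callable[[str], str]
) -> list[Segment]:
    """Return new segments with translated text and untouched structure."""
    return [replace(seg, text=translate(seg.text)) for seg in segments]


# Toy English-to-German glossary standing in for the translation model.
GLOSSARY = {
    "Safety": "Sicherheit",
    "Read the manual.": "Lesen Sie das Handbuch.",
}


def toy_translate(text: str) -> str:
    return GLOSSARY.get(text, text)


doc = [Segment("heading", "Safety"), Segment("body", "Read the manual.")]
translated = translate_document(doc, toy_translate)
for seg in translated:
    print(seg.style, "|", seg.text)
```

Because the translator never sees the structural fields, it cannot damage them; writing the output file then reuses the original layout with the new text.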

3) Quality control that makes cost scale

For technical documents, LinguaCore’s core efficiency gain comes from AI assisted quality triage:
  • The AI produces a translation draft
  • The AI self flags uncertain or high risk segments
  • Human reviewers focus only on flagged portions
This selective review loop is the mechanism behind the reported reduction of roughly 80 percent in hours spent translating technical documents, and the lower operating cost that follows, without an appreciable loss in quality.
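
The triage step can be sketched as a threshold split, assuming each translated segment carries a confidence score between 0 and 1. How LinguaCore actually flags uncertain or high risk segments is not described here, so the score, the 0.8 threshold, and the `triage` function are illustrative assumptions only.

```python
def triage(segments: list[dict], threshold: float = 0.8) -> tuple[list[dict], list[dict]]:
    """Split segments into auto-accepted and flagged-for-human-review lists."""
    auto, review = [], []
    for seg in segments:
        (auto if seg["confidence"] >= threshold else review).append(seg)
    return auto, review


# Made-up confidence scores for five segments of one document.
segments = [
    {"id": 1, "confidence": 0.95},
    {"id": 2, "confidence": 0.62},
    {"id": 3, "confidence": 0.88},
    {"id": 4, "confidence": 0.40},
    {"id": 5, "confidence": 0.91},
]
auto, review = triage(segments)
review_share = len(review) / len(segments)
print([s["id"] for s in review], review_share)
```

The threshold is the lever that trades reviewer hours against risk: raising it sends more segments to humans, lowering it sends fewer.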

4) Privacy and security posture

LinguaCore’s privacy policy states it processes personal data under GDPR aligned requirements and references ISO 27001:2013 as an information security standard guiding its security approach.

How it works

Flow A: Text translation

  1. User enters text
  2. User selects language pair plus domain and writing style controls
  3. System generates a translation that matches the chosen context

Flow B: Document translation with formatting preserved

  1. User uploads a document
  2. User selects languages plus domain and writing style controls
  3. Translation runs as a job and is stored for retrieval
  4. User downloads the translated document in the same format, with layout retained

Flow C: Targeted human review for technical content

  1. AI translates the document
  2. AI flags segments likely to need expert review
  3. Reviewer validates and edits only those segments
  4. Final version is delivered with high quality at dramatically lower operating cost

Conclusion

LinguaCore modernizes professional translation by combining context controlled generation, format preserving document translation, and a pragmatic quality loop where AI does the heavy lift and humans intervene only where the system signals it is necessary. The result is a translation workflow that scales economically without sacrificing the standards professional customers expect.