Mitko Vasilev
mitkox
AI & ML interests
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
Recent Activity
replied to
their
post
10 days ago
Can it run DeepSeek V3 671B is the new 'can it run Doom'.
How minimalistic can I go with on device AI with behemoth models - here I'm running DeepSeek V3 MoE on a single A6000 GPU.
Not great, not terrible, for this minimalistic setup. I love the Mixture of Experts architectures. Typically I'm running my core LLM distributed over the 4 GPUs.
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
Organizations
mitkox's activity
posted
an
update
1 day ago
replied to
their
post
10 days ago
DDR5 on HP Z8 G5
replied to
their
post
10 days ago
exactly Q2 med with ~190GB RAM
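As a back-of-the-envelope check on those numbers, the average storage per weight can be worked out from the two figures in the thread (671B parameters, ~190GB RAM). This is only a rough sketch: it assumes 1 GB = 10^9 bytes and that the whole 190 GB is model weights, ignoring the KV cache and runtime buffers. The function name is made up for illustration.

```python
def effective_bits_per_weight(model_bytes: float, n_params: float) -> float:
    """Average number of bits stored per parameter for a given memory footprint."""
    return model_bytes * 8 / n_params


# Figures from the thread: ~190 GB of RAM for the 671B-parameter model.
# Assumptions: 1 GB = 10**9 bytes, and all of it holds weights.
bpw = effective_bits_per_weight(190e9, 671e9)
print(f"{bpw:.2f} bits per weight")  # prints "2.27 bits per weight"
```

An average a little above 2 bits per weight is consistent with a 2-bit class quantization.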
posted
an
update
10 days ago
Post
2394
Can it run DeepSeek V3 671B is the new 'can it run Doom'.
How minimalistic can I go with on device AI with behemoth models - here I'm running DeepSeek V3 MoE on a single A6000 GPU.
Not great, not terrible, for this minimalistic setup. I love the Mixture of Experts architectures. Typically I'm running my core LLM distributed over the 4 GPUs.
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
posted
an
update
7 months ago
Post
2519
I started Friday with decentralized AI using Gemma-2, and it all works without blockchain. This is what I did:
1. Pinned Gemma-2 9B in the InterPlanetary File System (IPFS) with the LoRA fine-tuning adapters.
2. Set up a llama-ipfs server to fetch and cache the model and adapters on the fly and run inference locally.
Now, I can use my on device AI platform across:
• All my macOS automation workflows
• All my browsers
• My Copilot++ in VSCode
• My Open Apple Intelligence (OAI, not to be confused with the other closed OAI owned by a nonprofit foundation and BigTech)
The llama-ipfs server’s RPC support lets me decentralize inferencing across all my devices, supercharging computing and energy efficiency.
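The "fetch and cache on the fly" idea from step 2 can be sketched roughly as below. This is not the llama-ipfs implementation, just a minimal illustration of content-addressed caching. It assumes a local IPFS node serving its HTTP gateway at `http://127.0.0.1:8080/ipfs/`; `cached_fetch`, `fetch_from_gateway`, and the CID `example-cid` are hypothetical names. For brevity it reads the whole file into memory, which a real multi-gigabyte model would need to stream instead.

```python
import tempfile
from pathlib import Path
from typing import Callable
from urllib.request import urlopen

# Assumption: a local IPFS node exposes its HTTP gateway at this address.
GATEWAY = "http://127.0.0.1:8080/ipfs/"


def fetch_from_gateway(cid: str) -> bytes:
    """Download the content behind a CID through the local gateway."""
    with urlopen(GATEWAY + cid) as resp:
        return resp.read()


def cached_fetch(
    cid: str,
    cache_dir: Path,
    fetch: Callable[[str], bytes] = fetch_from_gateway,
) -> Path:
    """Return a local path for the CID, downloading it only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / cid
    if not target.exists():
        partial = cache_dir / (cid + ".part")
        partial.write_bytes(fetch(cid))
        partial.replace(target)  # rename into place once the download is complete
    return target


# Offline demo with a stub fetcher, so no IPFS node is needed to try it.
calls = []


def stub_fetch(cid: str) -> bytes:
    calls.append(cid)
    return b"fake model weights"


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("example-cid", Path(tmp), stub_fetch)
    second = cached_fetch("example-cid", Path(tmp), stub_fetch)
    cached_bytes = second.read_bytes()

print(len(calls))  # prints 1: the second call is served from the cache
```

Because a CID identifies content rather than a location, the CID itself works as the cache key and there is no invalidation logic to write.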
Make sure you own your AI. AI in the cloud is not aligned with you, it’s aligned with the company that owns it.
posted
an
update
7 months ago
Post
2200
I'm decentralizing my AI end2end, from the AI model distribution to on device AI inferencing. llama-ipfs - llama.cpp integrated with Interplanetary File System for distributing peer2peer and loading AI models without the need for cloud storage or AI model Hub.
llama.cpp now supports decentralized inferencing with RPC, allowing the distribution of workload across all home devices. This functionality can be enhanced with a P2P ad-hoc VPN, enabling the extension of distributed inferencing to any device on any network.
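For a sense of what the RPC setup looks like, here is a small sketch that only builds the command lines for one possible layout: an `rpc-server` on each worker device and a `llama-cli` on the main machine pointing at them. The binary names and the `-H`, `-p`, `--rpc`, and `-ngl` flags are as I recall them from the llama.cpp RPC example, so check `--help` on your own build. The addresses, port, and model file name are hypothetical.

```python
from typing import List, Tuple


def rpc_server_cmd(port: int, host: str = "0.0.0.0") -> List[str]:
    """Command to start a worker that exposes its backend over RPC."""
    return ["rpc-server", "-H", host, "-p", str(port)]


def llama_cli_cmd(model: str, workers: List[Tuple[str, int]], prompt: str) -> List[str]:
    """Command for the main host, which offloads layers to the listed workers."""
    endpoints = ",".join(f"{host}:{port}" for host, port in workers)
    return ["llama-cli", "-m", model, "-p", prompt, "--rpc", endpoints, "-ngl", "99"]


# Hypothetical home network with two worker devices.
workers = [("192.168.1.10", 50052), ("192.168.1.11", 50052)]
cmd = llama_cli_cmd("model.gguf", workers, "Hello")

print(" ".join(rpc_server_cmd(50052)))
print(" ".join(cmd))
```

The RPC endpoints are unauthenticated as far as I know, which is one reason to keep them on a trusted network or behind the P2P VPN mentioned above.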
Imagine an open-source AI that's as decentralized as a potluck dinner - everyone brings something to the table, and there's ZERO need for blockchain. It's like a digital fortress, with security and privacy baked right in, not to mention a dollop of integrity and trust. This could be the secret sauce for an enterprise AI platform, complete with an integrated IT policy. It might just be the cherry on top for the next generation of Apple Intelligence and Copilot+ PCs.
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
posted
an
update
7 months ago
Post
673
I'm decentralizing my AI. I'll be using Radicle for decentralized Git and IPFS for distributing AI models.
I believe there is a significant opportunity to democratize open AI development moving forward. I appreciate that Radicle is open-source, prioritizes local operations, functions offline, seeds data peer-to-peer from my node, is programmable, and incorporates built-in security features.
IPFS is great decentralized data storage, and I have already begun seeding SLMs and LoRA adapters. Tomorrow I will add my collection of LLMs, VLMs, and the other models and datasets I'm actively using. I have 10 Gbps fiber at home, so my node has enough bandwidth.
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
posted
an
update
7 months ago
Post
3423
I've made an on device AI comparison between open source, Apple Intelligence, and Microsoft Copilot+ PC. This OS and applications level integration will bring GenAI to everyone, be it consumers or businesses, over the next year.
Communities and BigTech hold divergent visions regarding the problems they aim to solve, ways to lock in users and enterprises, as well as their commercialization and GTM strategies.
I'm aware that this table has the potential to expand into an epic 30-page saga during an in-depth analysis, but hey, it's a beginning. Do you think I should throw in a few more comparisons? I'm all ears for your thoughts and critiques!
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
replied to
their
post
7 months ago
The screenshot is a preview of Copilot+, using Phi Silica, which comes with the Azure Edge AI package.
posted
an
update
7 months ago
Post
2376
Me: I want on device AI: fast, without latency, with real privacy, convenient for use and development.
Microsoft: The best I can do is Copilot+. You need a special Qualcomm chip and Windows 11 24H2. Today I can give you only Recall, taking screenshots and running a visual model to write context about what you are doing in the unencrypted Semantic Index database for embeddings. I'm giving you SLMs Phi Silica, accessible only via API and SDK. In the autumn I can give you the developer tools for C#/C++ and you can use them.
Apple: The best I can do is Apple Intelligence. You need a special Apple chip and macOS 15. Today I can give you only marketing. In the autumn I can give you on-device 3B quantized to 3.5bit mysterious SLMs and diffusion models with LoRA adapters. We will have an encrypted Semantic Index database for embeddings and agentic flows with function calling. We will call all of them with different names. In the autumn I will give you the developer tools in Swift and you can use them.
Open Source: The best I can do is llama.cpp. You can run it on any chip and OS. Today you can run AI inferencing on device and add other open source components for your solution. I can give you local AI models SLMs/LLMs - from Qwen2-0.5B to Llama3-70B. You can have an encrypted local embeddings database with PostgreSQL/pgvector or SQLite-Vec. I can give you a wide choice of integrations and open-source components for your solution - from UIs to agentic workflows with function calling. Today I can give you the developer tools in Python/C/C++/Rust/Go/Node.js/JS/C#/Scala/Java and you can use them.
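To make the local embeddings database idea concrete, here is a minimal stand-in using only Python's built-in `sqlite3`. It is not pgvector or SQLite-Vec, and it is not encrypted: vectors are stored as BLOBs and searched by brute-force cosine similarity in Python. The table layout, helper names, and the toy 3-dimensional vectors are all made up for illustration; real embeddings would come from a local model.

```python
import math
import sqlite3
import struct
from typing import List, Sequence, Tuple


def pack(vec: Sequence[float]) -> bytes:
    """Serialize a vector as packed 32-bit floats."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob: bytes) -> List[float]:
    """Inverse of pack()."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(conn: sqlite3.Connection, query: Sequence[float], k: int = 3) -> List[Tuple[float, str]]:
    """Brute-force nearest neighbours: score every row, keep the best k."""
    rows = conn.execute("SELECT text, embedding FROM docs").fetchall()
    scored = [(cosine(query, unpack(blob)), text) for text, blob in rows]
    scored.sort(reverse=True)
    return scored[:k]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")

# Toy vectors standing in for real embeddings.
toy = {
    "llama.cpp notes": [1.0, 0.0, 0.0],
    "pgvector notes": [0.0, 1.0, 0.0],
    "mixed": [0.7, 0.7, 0.0],
}
conn.executemany(
    "INSERT INTO docs (text, embedding) VALUES (?, ?)",
    [(text, pack(vec)) for text, vec in toy.items()],
)

results = top_k(conn, [1.0, 0.1, 0.0], k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A full scan like this is fine for a few thousand rows. Dedicated extensions such as pgvector or SQLite-Vec move the distance computation into the database, which is what makes larger collections practical.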
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
posted
an
update
7 months ago
Post
2429
I've spent some time checking the promises vs reality of on-device AI between Apple Intelligence and Microsoft Copilot+. Reading the marketing documentation is good, but not enough. Hands-on tests are best; unfortunately, neither is there yet.
Both are looking to lock developers behind a local API to the SLM inferencing engine and an SDK that mixes open source and proprietary code. Neither can work air-gapped and offline for meaningful workflows, only for some basic ones, and both require a hybrid local/remote AI plane that calls back to either APIs on Azure or Apple Private Cloud Compute.
Some of the Copilot+ functionality is available in Windows App SDK 1.6 exp2. It's focused on old-school enterprise developers, and I'm not sure they will be the early adopters of GenAI-backed apps... I still have Recall on my dev PC, even though they have removed it.
It is hard to get anything on Apple Intelligence beyond the vague description and the State of the Union video. Even the current betas of macOS 15 and Xcode don't have any "A.I." in them. At the moment it is all promises, with no technical documentation or code.
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.