Sunday, November 7, 2021

Costs of a Real World Ethereum Contract

GAS PRICE PSA (2017-07-25): The median gas price at the time of writing this article was, and continues to be, in the realm of 20 Gwei. This is far higher than the typical average and safe-low prices found on EthGasStation (4 and 0.5 Gwei respectively). The median is so high because of bad gas-price defaults found in many wallets. I highly recommend using EthGasStation’s average gas price or lower, both to avoid paying high fees and to help drive down the market rate for gas.

I previously discussed calculating the costs of Ethereum smart contracts by taking a look at low-level operations called OPCODES in conjunction with the market rate for running those OPCODES (the gas price). The examples given were simple but a bit contrived, so I decided to take last week’s analysis and apply it to an actual smart contract from start to finish.

I’m working on a series of simple smart contracts that are free and open for use. We’ll use the first one in this series — Escrow — and do a deep dive on the costs associated with it. 

***





The Contract

A quick background on the contract up for analysis. This contract involves three parties — a sender, a recipient, and an agreed-upon arbitrator. The sender initializes the contract with some amount of ether and specifies the recipient, the arbitrator, and an expiration date. If any two of the three parties (sender, recipient, or arbitrator) confirm the payment via the confirm function, the funds are released to the recipient. If two confirmations are not made prior to the expiration, the sender can void the escrow agreement and retrieve their funds via the void function call.

The Costs

I deployed an instance of this contract onto the Rinkeby testnet, which can be seen on Etherscan. This contract has three transactions associated with it, all of which use a gas price of 20 Gwei, the median price seen on EthGasStation.

  • The first transaction initializes the contract and deposits 0.5 ether into the contract. Transaction Cost: 0.01072934 Ether ($3.21 at $300/ETH)
  • In the second transaction the sender calls confirm. Transaction Cost: 0.00093492 Ether ($0.28 at $300/ETH)
  • In the third transaction the arbitrator calls confirm, and the funds are disbursed to the recipient. Transaction Cost: 0.00164754 Ether ($0.49 at $300/ETH)

If I were willing to wait about 20 minutes for my transactions to be picked up, I could have gotten these in on the main-net for 0.5 Gwei (EthGasStation’s “safe low”). At this gas price, deploying the contract costs around $0.08 and the other two transactions cost $0.01 or less at $300/ETH.

Considering these are fixed costs for any amount of ether held in the escrow contract, the fees seem reasonable. You would likely need to pay your arbitrator some amount to participate, but much of the cost and risk for the arbitrator (securely holding and disbursing funds) has been eliminated. You would have to pay solely for the arbitration — the decision — rather than all of the bookkeeping.

This is a great place to stop if you just wanted the straight-up costs of a real-world example. The rest of the article is a deep dive into where exactly these costs come from.

***

Costs to Deploy

In Solidity, the primary programming language currently used for smart contracts, contracts are initialized via a constructor. You can see in the Escrow constructor that we pass in 4 pieces of data — the addresses of the three parties involved and a timestamp in the future at which the contract can be voided.

The initialization transaction for this contract is by far the most expensive operation. It required 536467 gas to deploy the contract and execute the constructor code. At 20 Gwei per gas, deploying the contract cost 0.01072934 ether, or about $3.21 USD at the current exchange rate of $300/ETH.

The Constructor

To further examine the initialization, let’s take a look at the VM trace of the transaction. This shows the OPCODES executed by the EVM in the constructor of the Escrow contract, which account for 113539 of the gas used. The most expensive operations were a number of SSTOREs used to store the addresses of the actors and the expiration timestamp, and to initialize some storage locations for the actors array and the confirmations mapping. These SSTORE operations alone accounted for 110000 gas, or about 97% of the gas used in the constructor. The rest of the OPCODES were used for checking the validity of the timestamp and for the business logic around getting everything stored.


What about the rest?

The constructor only accounted for about 20% of all gas used in this transaction. Where was the remaining 80% spent?

In this transaction, we specified a maximum gas allowed to be used (“Gas Limit”) as 1000000 gas. The EVM starts at this number and counts down with each operation to ensure there is enough gas remaining. If the EVM hits zero gas while in the middle of code execution, the transaction fails, changes are undone, and the fee associated with gas is still paid to the miner.

We can see in the VM trace that we begin the constructor execution with 837872 gas remaining. This means that 162128 (1000000 - 837872) gas, or about 30% of the total used, had already been spent by the time we reached the constructor.

What costs so much before we even get to the code execution? First, there is a 21000 gas baseline transaction fee. This fee, called Gtransaction in the Yellow Paper, is paid on every transaction in the Ethereum network. Next is an additional 32000 gas (Gcreate) paid because this is a contract-creation transaction. We’re at 53000 gas — still 109128 gas unaccounted for before the constructor. The bulk of this gas pays for the size of the transaction data seen in the “Input Data” field on Etherscan: 3556 hex “nibbles”, or 1778 bytes of data. As seen on page 20 of the Yellow Paper, zero bytes of transaction data cost 4 gas each (Gtxdatazero) and non-zero bytes cost 68 gas each (Gtxdatanonzero). We can calculate how many non-zero and zero bytes there are with the equation x * 68 + (1778 - x) * 4 = 109128, where x is the number of non-zero bytes and 1778 - x is the number of zero bytes. Solving for x gives x = 1594, so there are 1594 non-zero bytes and 184 zero bytes. As a sanity check, I wrote a Python script to count the zero and non-zero bytes in the txdata. The numbers add up.
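
The script itself isn’t included here, but a minimal sketch of the idea might look like this, assuming the hex string is pasted in from Etherscan’s “Input Data” field:

# Minimal sketch: count zero and non-zero bytes in txdata and price them
# per the Yellow Paper (4 gas per zero byte, 68 gas per non-zero byte).
def txdata_gas(txdata_hex):
    if txdata_hex.startswith("0x"):
        txdata_hex = txdata_hex[2:]
    data = bytes.fromhex(txdata_hex)
    zero = sum(1 for b in data if b == 0)
    nonzero = len(data) - zero
    print(nonzero, "non-zero bytes,", zero, "zero bytes")
    return nonzero * 68 + zero * 4

# For the deployment transaction's 1778 bytes of input data, this should
# report 1594 non-zero and 184 zero bytes and return 109128.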

The remaining 50%

This still leaves about 50% of the gas being used after the constructor. Our gas limit started at 1000000. The last instruction of the constructor code left us with 724333 gas available. The transaction used 536467 gas in total, so the gas available at the end of the entire transaction was 1000000 - 536467 = 463533 gas. The amount available at the end of the constructor code, less the amount available at the end of the transaction, equals the gas used after the constructor finished: 724333 - 463533 = 260800 gas.


At this point, we have paid for the transaction data and for the initialization of the contract, but what about future calls to the contract? Future calls will have to execute code, so that code must live on-chain in the state of the contract itself. Page 9 of the Yellow Paper discusses the cost of the “code deposit” paid for adding bytecode to the blockchain state: cost = Gcodedeposit * |o|, where o is the “runtime bytecode” and Gcodedeposit is 200 gas/byte. The runtime bytecode is the original bytecode sent in the transaction, stripped of the constructor and general initialization code. Because the initialization code only runs once, at creation, it is stripped out to save you money and to save node operators storage space.

The Solidity compiler outputs both the bytecode and the runtime bytecode. Remix, an in-browser Solidity editor/compiler, also shows you these two values under “contract details”. Remix is great for quick info and sanity checks.

From the compiler output, I see that the size of the bytecode is 1650 bytes, while the size of the runtime bytecode is 1304 bytes. This implies that 346 bytes (1650–1304) are initialization code. The cost of the code deposit on the runtime bytecode is thus 1304 * 200 = 260800 gas which is exactly the amount we expected.
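
Putting it all together, every component of the deployment cost is now accounted for, and the pieces reconcile exactly. A quick sanity check in Python:

# All numbers below come from the Yellow Paper or the analysis above.
g_transaction = 21000       # base fee paid by every transaction
g_create      = 32000       # surcharge for contract-creation transactions
txdata        = 109128      # 1594 non-zero bytes * 68 + 184 zero bytes * 4
constructor   = 113539      # constructor execution, from the VM trace
code_deposit  = 1304 * 200  # runtime bytecode size * Gcodedeposit

total = g_transaction + g_create + txdata + constructor + code_deposit
print(total)  # 536467 -- exactly the gas used by the deployment transaction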

Again, the contract initialization transaction is by far the most expensive operation related to the Escrow contract, and likely for most contracts. We send a transaction full of bytecode, run constructor code that initializes a number of storage slots, and then pay a fee for all of the code we leave on the blockchain in perpetuity. Increasing the size of the blockchain — the globally distributed and replicated database — is and should be expensive.


Costs to Confirm

Confirmation #1

The first confirmation on the Escrow contract simply updates the state to show the user confirmed. The money is still held and not released until the second confirmation.

This transaction had a gas limit of 90000 gas and used 46746 gas. Code execution began with 68728 gas remaining and ended at 43254, so running the code used 68728 - 43254 = 25474 gas. If you take a look at the trace, there is one SSTORE costing 20000 gas, used to store the fact that the user sent in their confirmation. This is the majority of the code-execution cost. The next most expensive operations are a number of SLOADs, which load words from storage at 200 gas each.

Because code execution began at 68728 of the original 90000 gas limit, 90000 - 68728 = 21272 gas was used before execution. 21000 gas is the base cost of any transaction, so there is a remaining 272 gas to account for. Remember, non-zero transaction data costs 68 gas/byte, and this transaction had 4 bytes of data costing 4 * 68 = 272 gas.
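
Those 4 bytes are the function selector: the first four bytes of the keccak-256 hash of the function’s signature. A small sketch using the eth-utils package (the signature confirm() is my assumption; the post only names the function):

# Reproduce the 4-byte selector and its txdata cost. Assumes the
# function's signature is "confirm()" -- not shown in the post.
from eth_utils import keccak  # pip install eth-utils

selector = keccak(text="confirm()")[:4]
zeros = selector.count(0)
print(selector.hex())
print("txdata gas:", (4 - zeros) * 68 + zeros * 4)  # the post arrives at 272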

21272 pre-execution gas plus 25474 execution gas equals 46746 gas, which is the total — all gas accounted for.


Confirmation #2

The second confirmation on the Escrow contract both updates the state to show the user has confirmed and releases the funds to the recipient. We expect this transaction to be a bit more expensive than the first confirmation because it is doing what the first did plus sending ether.

This transaction had a gas limit of 90000 gas and used 82377 gas. Code execution began at 68728 gas and ended at 7623 gas, so running the code used 68728 – 7623 = 61105 gas. The code execution started at exactly the same gas count as the first confirmation so 21272 gas was used prior to code execution, accounting for the 21000 base transaction cost plus 4 bytes of txdata. The amount used in code execution, 61105 gas, plus the amount used prior to code execution, 21272 gas, equals 82377 gas — all gas accounted for.

Let’s take a deeper look at the VM trace. Like the previous transaction, there is one SSTORE to store the fact that this user confirmed. The bulk of the remaining 41105 gas is used on an OPCODE named CALL. This operation used 32400 gas.

In the trace, it looks like 39981 gas is used by the CALL operation, but this is a bit misleading: it is actually the amount of gas allotted to the CALL rather than the total consumed. Gas is allotted to CALL operations rather than simply spent because a CALL can execute code if the receiving address is a contract, and the exact amount of gas required to run that code is not known until it executes. We can look at the operations before and after the CALL to calculate how much was actually used. The operation before the CALL ends at 40064 gas, and the operation after begins at 7664 gas, so the CALL sandwiched between them used 40064 - 7664 = 32400 gas.


Let’s take a look at pages 20 and 29 of the Yellow Paper to understand why the CALL operation costs what it does. Executing CALL costs 700 gas (Gcall) no matter what. Using CALL to transfer a non-zero amount of ether costs an additional 9000 gas (Gcallvalue). A CALL operation that adds a new account to the blockchain state costs an additional 25000 gas (Gnewaccount); in this instance, the CALL did add a new account, because the recipient address previously had no value or contract associated with it. A CALL with a non-zero value transfer also gets a stipend of 2300 gas (Gcallstipend) that is subtracted from the other costs of the operation. So in total

700 (Gcall) + 9000 (Gcallvalue) + 25000 (Gnewaccount) - 2300 (Gcallstipend) = 32400 gas

CALL gas is accounted for. 


The expensive 25000 gas used for Gnewaccount was a bit surprising. If the recipient had previously been a non-empty account, our gas costs would have been significantly smaller. When doing gas calculations and estimates, there are a number of complexities like this that depend on the current state.
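
To make the state-dependence concrete, here is the accounting above folded into a small function (a sketch of the 2017-era rules described in this post, not a full implementation):

# Net gas charged for a CALL under the rules described above.
def call_gas(transfers_value, creates_account):
    gas = 700                    # Gcall: flat cost of the operation
    if transfers_value:
        gas += 9000              # Gcallvalue: non-zero value transfer
        gas -= 2300              # Gcallstipend: credited back to the caller
        if creates_account:
            gas += 25000         # Gnewaccount: recipient not yet in state
    return gas

print(call_gas(True, True))   # 32400 -- what confirmation #2 paid
print(call_gas(True, False))  # 7400  -- if the recipient had already existed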

Costs to Void

I’ll leave calculating and analyzing the cost of a successful void transaction to the reader. We would expect it to cost a similar amount to the second confirmation, because it also does a value transfer via a CALL operation, but a bit less, because it does not update the confirmation state via an expensive SSTORE.

Further Notes

It should be noted that the expensive CALL operation in confirm likely should have been avoided entirely. Directly sending funds after an effect is an anti-pattern in Solidity. The preferred pattern is instead to update state in one function call and allow for a subsequent withdrawal. In the Escrow contract, the second confirmation should change the state of the contract to “confirmed”, disabling calls to void and enabling the recipient to call a new function named withdraw. This would avoid issues related to unexpected fees or operations triggered by the CALL.

The red flag in this analysis was the fact that we had an operation that could balloon in gas costs due to the unknown nature of what account/code would be called and executed. This unknown can lead to bugs such as the DAO reentrancy bug.

Smart contracts are incredibly powerful but, as we’ve seen, can be quite complex. The languages, tools, and best practices surrounding them are still in their infancy. I urge the community to focus a significant portion of its efforts on building out the underlying tools and architecture to ensure that we can create both correct and affordable smart contracts.


Good Luck.


How to get the balance of all accounts in an Ethereum network using JavaScript

I have created a set of accounts in a private Ethereum network. I am able to check the balance of all these accounts using the following command:

> checkAllBalances();

I have created a web page that should display the balance of all the accounts at the click of a button, so I want to write the JavaScript that lists the balances. Any ideas on how to do it?

***

You can write a function that fetches each account balance using the eth.getBalance() method in the geth JavaScript console. Refer to the code below:

function checkAllBalances() {
  var i = 0;
  eth.accounts.forEach(function (e) {
    // Print the account address and its balance, converted from wei to ether
    console.log("eth.accounts[" + i + "]: " + e + " \tbalance: " +
                web3.fromWei(eth.getBalance(e), "ether") + " ether");
    i++;
  });
}

Good luck coding.




Saturday, July 18, 2020

Quant Trading





1 Year in Quant Trading



[Chart: live performance of my best strategy (175% ROI in the last 9 months)]
I’m approaching one year since diving full time into quant trading. My business a year ago wasn’t performing too well, and I was hoping for more control over returns — especially for more predictable ROI. That’s how it all started.

I didn’t expect this journey to be as challenging as it’s been — looking back at all the learning, re-learning, programming, re-programming, testing, re-testing, and launching strategies, only to see them fail. However, a few strategies made it through the whole process and became profitable. These winning strategies share some common patterns, which I have tried to compile into the lessons below, learned over the past year.

Some of the points may appear obvious to a more experienced trader. For me, each one was an enormous insight, sometimes followed by a big shift in how I approach the markets. I wish I had known these points beforehand; it might have saved me countless hours. The following lessons are addressed to myself and are in no meaningful order.

Strategically pick your markets

Trading US stocks, forex, and bonds is probably a bad idea: there is too much competition from the biggest players. Find your liquidity sweet spot by looking at markets that support your liquidity needs without being orders of magnitude larger. Play and win in niche markets by learning their rules, rather than trading where the big players trade and the game is much harder. My point is this: a strategy on Philippine stocks will likely be more profitable than the same one on US stocks.

Learn the rules and accept them

I traded a few different markets (in hindsight, I should have stuck with one). Each has different rules and is rigged in its own way. Market makers (or the most dominant players in a market) do everything to win. Assume the markets are rigged, learn the rules, and play by them; don’t deny them by thinking markets act naturally. Don’t try to “outsmart” the markets; this will likely backfire. Look for the traces big players leave (behavioral patterns, spoofing, placed orders, and liquidity hunts) and use them to your advantage.

Know your priorities

There’s so much to do in quant trading: strategy development, optimization, backtesting, execution, and risk management. Don’t focus on the wrong things in the beginning — like optimizing parameters. Rather, build very basic MVP versions of each part of the equation and optimize by iterating in production. A perfectly optimized strategy won’t help if the execution part doesn’t work correctly.

Expect to lose in your first year

Don’t start scaling as soon as you see some initial success; it may wipe out a big part of your portfolio (40% in my case). It takes much more effort to make losses back than to adopt proper risk measures in the first place. By expecting to lose (for the first year at least), you won’t be tempted to put more capital than necessary into testing and learning.

Don’t rush with capital, rush with execution

I was too quick to scale up capital without thinking about risk. At the same time, I often found myself in analysis paralysis, promising myself to launch a new strategy after “just one more optimization”. I was over-optimizing. I should have launched multiple strategies to see what works first, then optimized in an ongoing way. Building and optimizing strategies based on theory doesn’t help if there’s no practical feedback.

Don’t use price stops

I found there are two ways price stops can be used: either not at all, or only to protect against black-swan events (the 99.9th percentile of the volatility distribution). Instead of price stops, use time stops and proper position sizing. Price stops will, as research shows, destroy a good strategy simply due to randomness in volatility. The time dimension of the hypothesis expressed by your trade is much more manageable and predictable than the price dimension (both in backtests and in live trading). By using a time stop, you set a constraint on how long your hypothesis is valid, which almost always reduces variance (and increases the Sharpe ratio).
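
As a toy illustration (all numbers are hypothetical), an exit rule along these lines leans on elapsed time, with price acting only as a catastrophic backstop:

# Toy time-stop exit: close every trade after a fixed number of bars,
# and let price trigger an exit only for black-swan moves.
MAX_BARS = 24              # hypothesis considered stale after 24 bars
BLACK_SWAN_RETURN = -0.20  # roughly a 99.9th-percentile event

def should_exit(bars_held, trade_return):
    return bars_held >= MAX_BARS or trade_return <= BLACK_SWAN_RETURN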

Know entries and exits

For each trade, know where to enter and where to exit. For me, these are set by two rules, one being a modified version of the Average True Range formula. Pre-defined rules for entries and exits are almost a requirement in order to properly backtest and to know what to expect in live trading.

Know your numbers

For each strategy, you must know the expected value, hit rate, expected drawdown, longest drawdown, expected volatility, variance, Sharpe ratio, standard deviation of returns, skewness of returns, and value at risk. Proper bet sizing, risk of ruin, the Kelly fraction, and optimal f should also be chosen strategically, based on how the strategy performs in the backtest.
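
Most of these numbers fall out of the per-period return series. A rough sketch of a few of them, assuming a NumPy array of returns and a conventional annualization factor:

import numpy as np

def strategy_stats(returns, periods_per_year=252):
    """A handful of the numbers above, from per-period strategy returns."""
    equity = np.cumprod(1 + returns)
    drawdown = equity / np.maximum.accumulate(equity) - 1
    return {
        "expected value": returns.mean(),
        "hit rate": (returns > 0).mean(),
        "volatility (ann.)": returns.std() * np.sqrt(periods_per_year),
        "sharpe (ann.)": returns.mean() / returns.std() * np.sqrt(periods_per_year),
        "max drawdown": drawdown.min(),
        "skewness": ((returns - returns.mean()) ** 3).mean() / returns.std() ** 3,
        "value at risk (95%)": np.percentile(returns, 5),
    }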

Make risk management a priority

Wiping out 40% of your capital can happen in a day; making it back can take many months — if not years. Use proper risk management from the start, and be aware of the potential risk of ruin due to black-swan events. It’s always a good idea to expect the worst case: waking up one day to a -50% market shouldn’t be a challenge for your strategies.

Use fewer parameters but know what they do

My best-performing strategy uses only 3 parameters. These are easy to optimize and easy to test for robustness. Know exactly what your parameters do and why they are there. Probably the worst mistake is letting an optimization script generate parameter combinations, e.g., slow/fast periods for multiple moving-average combinations. Something will surely look good on paper and in the backtest, but it is doubtful the same strategy will work in live trading.

Create a good backtest and know the ins and outs

Don’t go with an existing solution (this applies to optimization as well, by the way) — at least not before you have built multiple backtests yourself. You must understand the effects of slippage, fees, the sequence of execution events, and different order types. I wrote many backtest scripts, the first few being very intricate. My most recent version runs in 12 lines of code (mostly parallel computations), which proves, once again, that simplicity wins.
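
My 12-line version isn’t reproduced here, but a skeleton of a simple vectorized backtest at that level of simplicity might look like this (the fee level is a hypothetical placeholder):

import numpy as np

def backtest(prices, signal, fee=0.0005):
    """Toy vectorized backtest on NumPy arrays, not my production version.
    signal[i] is the position (-1/0/+1) decided at bar i; it earns the
    bar i -> i+1 return, so the signal can't see the move it trades."""
    rets = np.diff(prices) / prices[:-1]        # per-bar simple returns
    pos = signal[:-1]                           # position held over each return
    turnover = np.abs(np.diff(pos, prepend=0))  # position changes incur fees
    return pos * rets - turnover * fee          # net strategy returns per bar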

Find a good evaluation metric

Testing how the strategy performs is not enough; you must know what to look for. Of course, I started by looking for a high annual return. Optimizing for Sharpe was better, but still not what I needed (I still wonder why the Sharpe ratio is considered the industry standard when there are much better metrics). Finding the right optimization and evaluation metric is key; otherwise, you’ll build something that completely misses the goal.

Know what to look for in a strategy

To find a good evaluation metric, you need to know what you actually are looking for in a strategy, and that depends on many personal factors (portfolio size, accepted risk, etc.). Know the characteristics of your desired strategy, as they will define which evaluation metric to choose. I prefer consistent, negatively skewed strategies, and that is what I build for.

Focus on features, not on optimization

There’s a great range of tools for optimization: genetic optimization, non-convex optimization, principal component analysis, statistical/Bayesian optimization, and a thousand fancy libraries. From my point of view, optimization can improve a strategy by 10–20%, but it won’t produce a profitable strategy in the first place. If a strategy is bad, no optimization will help. Focus on deductive analysis and feature engineering — in simple terms, on making sense of the inputs and data.

Deep learning is overrated

I don’t get the hype. Machine learning is great, and deep learning (a.k.a. neural networks) is great, too. But optimizing 10,000 parameters will likely only lead to overfitting. If a strategy doesn’t work without power libraries like TensorFlow, it probably won’t work in production (even if the backtest is amazing). In short, the strategy should already be profitable with something as simple as a linear regression.

Better data, better features

“Data is the oil of the digital world,” someone said. I’ve heard some hedge funds use satellite images of parking lots to predict stock returns. Although that kind of data probably contains limited information (I’d guess about as much as weather-forecast data), it’s still usable and not bad by any means. My point is: focus on getting better data in order to produce better features. Combining multiple weak features and strategies will likely improve returns.

Academic papers are great, but…

Academic papers are great, but they usually fall short in terms of practicality. There’s simply a mismatch in incentives between academics and traders: academics are not traders, and traders don’t publish working strategies. Take academic research with a grain of salt, but don’t neglect it completely. I can’t count how many times I found a little piece of information really valuable. One paper won’t result in a good strategy, but applying insights from 20 papers might.

Fast feedback is a must

Competing with big players, especially in the HFT realm, is probably a bad idea (as I explained above). Going the opposite way — holding trades for days, months, or years — is not ideal either. For me, the sweet spot is a holding time of 5 to 60 minutes. If I can’t test a strategy within 2 weeks with statistical significance (meaning more than 100 trades), I don’t invest my time in it. To test a strategy with a multi-day (or longer) holding period, I’d need months to validate it, and that is not what I’m interested in.

Don’t trade only on price and price indicators

Price is a reflection of what has happened in the market; it carries little information about the actors and their intentions. Indicators are not useful enough either, as they are just derivations of price, and most of them lag. Markets are choppier and more automated nowadays, and lagging indicators aren’t as helpful as I assume they were in the past.

Derivations of derivations are useful

I found a way to make indicators useful: building features off of them. Strategies perform a lot better if an indicator — say, the moving average — is strategically refactored into something like a second derivation, e.g., splitting the MA values into bins and counting the occurrences per bin over the last X hours.
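
A sketch of what I mean, using pandas (the window lengths and bin count are arbitrary choices, and in live trading you would fix the bin edges up front rather than let qcut peek at the whole series):

import pandas as pd

def ma_bin_counts(close, ma_window=20, n_bins=10, lookback=48):
    """Second derivation of a moving average: per-bin occurrence counts
    of the MA's value over a trailing window."""
    ma = close.rolling(ma_window).mean()
    bins = pd.qcut(ma, q=n_bins, labels=False, duplicates="drop")
    one_hot = pd.get_dummies(bins)           # one indicator column per bin
    return one_hot.rolling(lookback).sum()   # occurrences per bin over the window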

Double your timeframe

Picking a higher timeframe almost always leads to better results. This can’t be repeated indefinitely, as your research is done with one specific timeframe in mind. However, if your strategy is optimized for 15 minutes, increasing the holding period from 15 to 30 minutes almost always yields better returns with lower risk.

More risky markets, smaller positions

Trade markets that are more volatile, because volatility means opportunity. Just be aware of the risks and adjust your position size accordingly. It can be much more profitable to trade a market that is 10x as volatile with 1/10th of the position. The risk-to-reward curve is not as linear as I thought — looking at you, Bitcoin!

Trading fees make a huge difference

By doubling the holding time as mentioned above, the role of fees is already reduced. Optimizing strategies specifically to avoid large fees is even smarter. Depending on the strategy (especially at higher frequencies), fees can make up more than 50 percent of returns. This means optimizing fees should be one of the highest priorities, whether that means using fewer market orders, using better brokers, or negotiating better deals with existing brokers.

Familiarize yourself with your trading environment

I mentioned above becoming familiar with one niche market; this applies even more to brokers and exchanges: their APIs, downtimes, and latencies. You should know their APIs inside and out, especially since many brokers have intricate and hidden functionalities that can really help your performance (conditional orders, better fill/status information on orders, bulk operations, etc.).

Afterthoughts

Thanks for reading. The amount I’m learning day by day is not slowing down, even though I’m approaching the 2,000-hour mark in quant trading. I think this is one of the few industries where, with passing time, the learning curve gets steeper, which actually makes me excited about the coming months and years. Finally, in case I missed something — or if you want to get in touch — please reach out to me via email.